Hacker News

How can we import a CSV dataset? The HTTP insert API returns this error message: expected JSON_OBJECT_BEGIN, got: JSON_OBJECT_BEGIN. The GROUP BY query on the homepage scans 1.8B rows in only 1.5 seconds, which is great, but how many nodes were used in that setup?



We do have a CSV import utility (the HTTP API expects JSON), but it's not in the current distribution/release build. I'll add it and update this comment once it's live.
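In the meantime, a workaround is to convert each CSV row into a JSON object client-side and send those to the insert API. This is a minimal sketch using only the Python standard library; the exact endpoint, field names, and expected JSON shape are assumptions, not the actual API contract:

```python
import csv
import io
import json

def csv_to_json_records(csv_text):
    """Convert CSV text (first row = header) into one JSON object per row.

    Each returned string is a standalone JSON object, which is the shape
    an insert API expecting JSON objects would typically consume.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row) for row in reader]

# Hypothetical example input; real data would come from a file.
records = csv_to_json_records("id,name\n1,alice\n2,bob\n")
# Each record could then be POSTed to the HTTP insert endpoint.
```

Note that this treats every value as a string; a real importer would also need per-column type conversion before insert.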

Queries are mostly I/O-bound when running on regular hard disks. The number of rows per second mainly depends on the number and types of columns that are accessed. For example, if we scan 1.8B rows and only load a single integer column from disk (and the integers are small), we only have to load about 1 byte per row (an idealized model that ignores some overheads, for illustration). Completing the query in 1.5 seconds then works out to a total I/O load of about 1144MB/s, so (depending on disk speed) around ~15 machines would suffice.
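The back-of-envelope arithmetic above can be checked directly. The 80 MB/s per-disk sequential read speed is my assumption (the comment only says "depending on disk speed"); the other numbers come from the comment itself:

```python
import math

rows = 1.8e9            # rows scanned by the GROUP BY query
bytes_per_row = 1       # single small integer column, idealized model
target_seconds = 1.5    # query latency from the homepage benchmark
per_disk_mb_s = 80      # assumed sequential read speed of one HDD

# Required aggregate read bandwidth, in MiB/s (matching the 1144 figure).
total_mib_s = rows * bytes_per_row / target_seconds / (1024 ** 2)

# Machines needed if each contributes one disk at the assumed speed.
machines = math.ceil(total_mib_s / per_disk_mb_s)

print(round(total_mib_s), machines)
```

With these assumptions the script reproduces both figures from the comment: roughly 1144 MB/s of aggregate bandwidth, spread over about 15 machines.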




