
Fred from IRONdb (http://irondb.io) here. First off, it's worth mentioning that "scalability" is a very relative term. Some people need to process a billion measurements per second; very few databases out there can handle that sort of task. If you're only idling along at something small that a single machine can easily handle (like a million data points per second), then the product doesn't need to scale much at all. At that point you'd be looking for operability and manageability... how does it behave when everything else in the world aims to ruin your day?

When we started scaling Circonus, we realized we needed to solve this billion-measurements-per-second problem on a system that scales near-linearly with added nodes and has manageable failure and recovery scenarios... if we didn't solve that, we'd be worried about data all the time instead of building value on data. That's IRONdb.

What all of the TSDBs you mentioned have in common is that they tend to handle increasing workloads by throwing a bigger box at the problem, and run into technical challenges when adding more nodes. How do they ensure data consistency and availability when multiple nodes are involved? Most of them accomplish this via consensus algorithms (Paxos, Raft, etc.).

This introduces a number of technical problems that readers of Aphyr's "Call Me Maybe" blog series (https://aphyr.com/tags/jepsen) will be familiar with. Under high workloads, almost all of these systems exhibit some degree of data inconsistency or corruption, and all of them sacrifice performance for consensus. And at serious scale, with petabytes of data, how does your TSDB detect and recover from the inevitable fallibility of disks (bit error rates being non-zero)? These are the manageability questions you run into at scale.
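To make the bit-rot point concrete, here's a minimal sketch (not IRONdb's actual on-disk format; the record layout and function names are illustrative assumptions) of the standard defense: store a checksum beside every record and verify it on read, or during a periodic background "scrub".

```python
# Sketch: checksummed records to detect silent disk corruption.
# Layout and names are hypothetical, not any real database's API.
import zlib

def write_record(payload: bytes) -> bytes:
    # Prefix the payload with a CRC32 of its contents (4 bytes, big-endian).
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def read_record(record: bytes) -> bytes:
    stored = int.from_bytes(record[:4], "big")
    payload = record[4:]
    if zlib.crc32(payload) != stored:
        # In a replicated store, recovery would re-fetch this record
        # from a healthy node rather than silently returning garbage.
        raise IOError("checksum mismatch: possible bit rot")
    return payload

rec = write_record(b"cpu.idle 1000 97.2")
assert read_record(rec) == b"cpu.idle 1000 97.2"

# Flip one bit to simulate media decay; the read now fails loudly.
corrupt = rec[:-1] + bytes([rec[-1] ^ 0x01])
try:
    read_record(corrupt)
    detected = False
except IOError:
    detected = True
assert detected
```

The interesting part at petabyte scale isn't the checksum itself, it's having a repair path (replicas, scrubbing) wired to every detection.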

We recognized this was a problem best avoided when we designed IRONdb, so we sidestepped the need for consensus altogether through the use of commutative operations. This means it doesn't matter in what order the time series data is applied; the result is the same. That freed us to focus our efforts on read vs. write performance, operational simplicity, and the other points you mentioned.
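A toy sketch of the commutativity idea (IRONdb's actual merge rules aren't described here; the max-per-timestamp rule below is just an illustrative choice): if the merge operation is commutative, associative, and idempotent, replicas converge no matter what order, or how many times, they apply the same writes, so no consensus round is needed to agree on ordering.

```python
# Order-independent ingestion into a hypothetical store keyed by
# timestamp. max() is commutative, associative, and idempotent,
# so apply order and duplicate delivery don't affect the result.

def merge(store, batch):
    """Apply a batch of (timestamp, value) samples to a store dict."""
    for ts, val in batch:
        store[ts] = max(store.get(ts, float("-inf")), val)
    return store

writes = [(1000, 3.2), (1001, 4.8), (1000, 3.2), (1002, 1.1)]

# Two replicas receive the same writes in opposite orders...
a = merge({}, writes)
b = merge({}, reversed(writes))

assert a == b  # ...and still converge to the same state
```

This is the same reasoning behind CRDTs: push the coordination cost into the data structure's merge rule instead of into a consensus protocol on the write path.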

I gave a high-level technical talk on these issues a few months ago: https://www.youtube.com/watch?v=QBatpIFii7M

As always, I'd recommend taking each for a spin and seeing for yourself. None are perfect and each will surprise and disappoint you in unique ways when you use it in anger. This coming from someone with ample anger.

Disclaimer: I've been a Postgres user for 18 years.



