
Hi Peter, as the blog post discusses, our distributed hypertables typically partition by both time _and_ "space" (i.e., some other column like device id, etc.) as a way to better parallelize I/O (reads & writes) for the "current" time. That is, each time slice is typically spread across all nodes that existed when the time interval was opened. So this greatly ameliorates the interesting "time" problem you mention.
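To make the idea concrete, here's a minimal sketch (not Timescale's actual implementation; the interval length, partition count, and function names are all hypothetical) of how a row can be routed to a chunk keyed by both a time slice and a hash of the "space" column, so that writes for the current interval fan out across every node:

```python
import zlib
from datetime import datetime, timezone

NUM_SPACE_PARTITIONS = 4          # hypothetical: one space partition per node
CHUNK_INTERVAL_SECONDS = 86_400   # hypothetical: one-day time slices

def route_to_chunk(ts: datetime, device_id: str) -> tuple[int, int]:
    """Map a row to a (time_slice, space_partition) chunk id.

    Each time slice is split across all space partitions, so inserts
    for the "current" interval are spread over all nodes rather than
    hammering a single one.
    """
    time_slice = int(ts.timestamp()) // CHUNK_INTERVAL_SECONDS
    # crc32 gives a deterministic hash, unlike Python's built-in hash()
    space_partition = zlib.crc32(device_id.encode()) % NUM_SPACE_PARTITIONS
    return (time_slice, space_partition)
```

Two rows from the same device always land in the same space partition, while consecutive days map to consecutive time slices.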

Now, if this time/space partitioning alone isn't sufficient (i.e., demand for a single device/userid/etc. at a specific time exceeds the read capacity of K nodes), the fact that time-series data is primarily insert-heavy (or even immutable) also gives us a lot of flexibility in how we replicate (as a sibling comment also suggested). And what really helps is that, by design, the architecture we built tracks fine-grained chunk information (rather than just coarse-grained hash partitions), which can enable dynamic replication of individual chunks. More on this to come.
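As a rough illustration of why per-chunk tracking matters (again a hypothetical sketch, not Timescale's internals; the class and method names are invented), a catalog that records replica placement per chunk can grow the replica set of one hot chunk without touching any other data:

```python
import zlib

class ChunkCatalog:
    """Hypothetical fine-grained chunk catalog.

    Because placement is tracked per chunk rather than per coarse hash
    partition, a single hot chunk can gain extra replicas on demand,
    with no resharding of the rest of the table.
    """

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.placement = {}  # chunk_id -> list of nodes holding a replica

    def create_chunk(self, chunk_id, replication_factor=1):
        # Deterministically pick an initial node, then wrap around for replicas.
        start = zlib.crc32(repr(chunk_id).encode()) % len(self.nodes)
        self.placement[chunk_id] = [
            self.nodes[(start + i) % len(self.nodes)]
            for i in range(replication_factor)
        ]

    def add_replica(self, chunk_id):
        """Place one more replica of a hot chunk on any node lacking it."""
        current = self.placement[chunk_id]
        for node in self.nodes:
            if node not in current:
                current.append(node)
                return node
        return None  # chunk already replicated on every node
```

The point is that `add_replica` operates on one chunk id in isolation; with only coarse-grained hash partitions, boosting read capacity for one hot key would mean re-replicating the whole partition.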




I was disappointed to see that Adaptive Chunking is deprecated[1]. Are there future plans to ~replace this functionality?

[1] https://docs.timescale.com/latest/api#set_adaptive_chunking


We deprecated Adaptive Chunking because we weren't thrilled with how it was working. But yes, we are looking into an improved way of solving this problem.


It would be great if you could share with us how this feature has been working out for you and how we can improve it in the future.



