
Time-series is more a specific use-case: data that has a primary time component (like sensor metrics). You can store it in any database, although the common choices are usually some sort of key/value or relational store with specific features for time-based queries.

HBase/Bigtable/DynamoDB/Cassandra are key/value. InfluxDB is key/value. Timescale is an extension to Postgres.




The big-time TS databases (Sybase, KDB, Informix Datawarehouse) are column-based, not key/value or traditional row-oriented relational. The ones you list are all lower-tier, trying to shoehorn a time field onto another model.


Those are still relational databases, just with column-oriented/column-store tables. I don't see how the storage layer changes the database type. For example, MemSQL has both rowstores and columnstores. Postgres 12 has pluggable table storage, which is what column-store projects such as zedstore build on.


> I don't see how the storage layer changes the database type.

It does, because it leads to other types of optimization. LittleTable [0], for example, keeps data that is adjacent in the time domain adjacent on disk, so querying large amounts of data that are close to each other in time is efficient even on slow (spinning) disks. Vertica [1] does column compression, which allows it to work efficiently with denormalized data (common in analytics workloads).
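To make the compression point concrete, here is a minimal sketch of run-length encoding applied to columns of a denormalized, time-ordered table. It is a generic illustration, not Vertica's or LittleTable's actual implementation; the sample rows and the helper names (`to_columns`, `rle_encode`, `rle_decode`) are made up for the example.

```python
from itertools import groupby

# Hypothetical denormalized readings: (timestamp, sensor, site, value),
# already ordered by time.
rows = [
    (1, "s1", "berlin", 20.5),
    (2, "s1", "berlin", 20.6),
    (3, "s1", "berlin", 20.6),
    (4, "s2", "berlin", 18.0),
    (5, "s2", "berlin", 18.1),
]

def to_columns(rows):
    """Pivot row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]

def rle_encode(column):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]

timestamps, sensors, sites, values = to_columns(rows)

# Repeated values sit next to each other within a column, so they collapse
# into a few runs. In a row layout the same values are interleaved with the
# other fields and cannot be collapsed this way.
encoded_sites = rle_encode(sites)
encoded_sensors = rle_encode(sensors)
```

The repeated `site` and `sensor` values are exactly the redundancy that denormalization introduces, which is why a column layout can absorb it cheaply.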

In an ideal world, you could have a storage layer sitting below a perfect abstraction; orthogonal to higher levels. In the real world, column-based and row-based are two completely different categories serving very different use-cases.

[0] https://meraki.cisco.com/lib/pdf/trust/lt-paper.pdf

[1] http://vldb.org/pvldb/vol5/p1790_andrewlamb_vldb2012.pdf


The underlying database type hasn't changed. LittleTable is a relational database (it's the first sentence in the paper). Vertica is also a relational database.

Storage is an implementation detail. Optimizations are improvements to performance. Neither affects the fundamental data model, which in relational databases is relational algebra over tuple-sets.
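A rough sketch of what I mean, with two toy storage layouts behind the same selection and projection. Everything here (`RowStore`, `ColumnStore`, `select_project`, the sample data) is hypothetical and only meant to show that the result set does not depend on the layout; it says nothing about performance.

```python
# Hypothetical sample relation with attributes (ts, sensor, value).
FIELDS = ("ts", "sensor", "value")
ROWS = [
    (1, "a", 10.0),
    (2, "b", 12.5),
    (3, "a", 11.0),
    (4, "a", 9.5),
]

class RowStore:
    """Keeps each tuple contiguous."""
    def __init__(self, fields, rows):
        self.fields = fields
        self.rows = list(rows)

    def scan(self):
        for row in self.rows:
            yield dict(zip(self.fields, row))

class ColumnStore:
    """Keeps each attribute contiguous."""
    def __init__(self, fields, rows):
        rows = list(rows)
        self.fields = fields
        self.columns = {f: [r[i] for r in rows] for i, f in enumerate(fields)}
        self.length = len(rows)

    def scan(self):
        # Reassemble tuples from the per-attribute arrays.
        for i in range(self.length):
            yield {f: self.columns[f][i] for f in self.fields}

def select_project(store, predicate, fields):
    """Selection followed by projection, returning a set of tuples."""
    return {tuple(t[f] for f in fields) for t in store.scan() if predicate(t)}

def is_sensor_a(t):
    return t["sensor"] == "a"

row_result = select_project(RowStore(FIELDS, ROWS), is_sensor_a, ("ts", "value"))
col_result = select_project(ColumnStore(FIELDS, ROWS), is_sensor_a, ("ts", "value"))
```

Both calls return the same set of tuples; only the physical arrangement underneath differs.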


Timeseries is the data model, and for the upper end that is synonymous with column-oriented. In my top comment, I mean timeseries/column-oriented (there are other series besides time, but they fit the same data model).

The top TS databases are more than just storage too. You need a query language that can exploit the ordering column-oriented gives you that the row-oriented relational doesn't.

The lower end (e.g., TimescaleDB) is trying to fit a timeseries model onto a row-oriented architecture, which is a poor fit.


Time-series is definitely not synonymous with column-oriented. The data model is separate from the storage layer, which is separate from the use-case.

You're talking about relational databases (which is the formal type) designed for large-scale analytics using column-oriented storage and processing and supporting a time-series use-case.

Storage and querying just for time-series specifically is more about product features than the underlying type. For example, here's Pinterest doing the same on HBase: https://medium.com/pinterest-engineering/pinalyticsdb-a-time...


If you could store timeseries data "in any database", kdb wouldn't be a thing. Just go and ask a quant trader to replace his kdb instance with Postgres. (Be prepared to be laughed out of the room.)


I'm not sure what your point is. Time-series describes the data, not the database.

You can store timeseries data in Postgres if you want to (and optionally add extensions like Timescale). You can store it in a key/value store like Redis or Cassandra. You can store it in Bigtable. You can store it in MongoDB. Obviously different scenarios require different solutions.
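As a rough illustration of the key/value case, here is a toy ordered key/value store where the key is a (series, timestamp) pair, so a time-range query becomes a contiguous key scan. This is not the Redis, Cassandra, or Bigtable API; `SortedKV` and its methods are hypothetical names for the sketch.

```python
import bisect

class SortedKV:
    """Toy ordered key/value store; keys are (series_id, timestamp) tuples."""

    def __init__(self):
        self._keys = []      # kept sorted so range scans are contiguous
        self._values = {}

    def put(self, series_id, timestamp, value):
        key = (series_id, timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def range(self, series_id, start, end):
        """Return (timestamp, value) pairs with start <= timestamp < end."""
        lo = bisect.bisect_left(self._keys, (series_id, start))
        hi = bisect.bisect_left(self._keys, (series_id, end))
        return [(k[1], self._values[k]) for k in self._keys[lo:hi]]

store = SortedKV()
store.put("cpu", 30, 0.7)
store.put("cpu", 10, 0.5)
store.put("mem", 20, 0.9)
store.put("cpu", 20, 0.6)

# Writes arrived out of order, but the scan comes back ordered by time.
cpu_window = store.range("cpu", 10, 30)
```

Whether this is a good fit at scale is a separate question; the point is only that the data model of the store stays key/value while the data is time-series.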

KDB is a relational database with row and column stores, features for time series and advanced numerical analytics, and its own programming language. KDB is a thing because of those abilities, whether you use it for time-series data or not.


It is very deceptive to say that you can _store_ timeseries data in "Postgres ... Redis or Cassandra" so the nature of the data should not be used to categorize databases. You can "store" data in /dev/null if you never have to do anything with the data.

> I'm not sure what your point is.

My point is very simple: there is a category of databases widely accepted as "timeseries databases", and they deserve a place in any conversation about "types of databases".


What category? The examples you used are formally relational databases. We can certainly talk about common use-cases for certain specific database vendors and products but that's not the same as the underlying type.

For example, here's Pinterest handling time-series data on HBase: https://medium.com/pinterest-engineering/pinalyticsdb-a-time...

There's a big difference, and muddying the definitions with marketing jargon causes too much confusion in this industry.



