Quantile Digest (Q-digest), or something similar, is what I believe is desired here.
From what I understand, it's a fixed-size data structure that represents quantiles as a tree of histogram bands, pruning the nodes whose densities differ least from their parents' in order to trade error for size. Q-digests also have the property that you can merge them together and re-compress, so you can turn second-resolution data into minute-resolution data, or compress more accurate (large) archival digests into smaller ones to, say, support stable streaming of a varying number of metrics across a link of varying bandwidth by sacrificing quality.
They're pretty simple because they were designed for sensor networks, but I think you could design similar structures with a dynamic rather than fixed value range, and with variable size (pruning nodes based on an error threshold instead of, or in addition to, a target size).
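To make the idea concrete, here's a rough sketch of a Q-digest in Python, assuming a fixed universe of `2**BITS` integer values and heap-style node numbering. The class name, the `k` compression factor, and the method names are all illustrative, not from any particular implementation:

```python
# A minimal Q-digest sketch over a fixed universe [0, 2**BITS).
# All names/parameters here are illustrative assumptions.
from collections import defaultdict

BITS = 4
UNIVERSE = 1 << BITS  # leaves occupy heap ids [UNIVERSE, 2*UNIVERSE)

class QDigest:
    def __init__(self, k=4):
        self.k = k                      # compression factor (size/error knob)
        self.n = 0                      # total observations
        self.counts = defaultdict(int)  # heap node id -> count

    def add(self, value, count=1):
        self.counts[UNIVERSE + value] += count
        self.n += count
        self.compress()

    def compress(self):
        # Bottom-up: fold a node and its sibling into their parent whenever
        # the combined weight is at most n/k. This is the pruning that trades
        # accuracy for size; a variable-size variant could prune against an
        # error threshold here instead of a fixed n/k.
        threshold = self.n // self.k
        for node in sorted(self.counts, reverse=True):
            if node <= 1 or node not in self.counts:
                continue  # root, or already folded as a sibling
            parent, sibling = node // 2, node ^ 1
            total = (self.counts.get(node, 0) + self.counts.get(sibling, 0)
                     + self.counts.get(parent, 0))
            if total <= threshold:
                self.counts[parent] = total
                self.counts.pop(node, None)
                self.counts.pop(sibling, None)

    def merge(self, other):
        # Merging is just adding counts node-by-node, then re-compressing;
        # this is what lets you roll per-second digests up into per-minute ones.
        for node, c in other.counts.items():
            self.counts[node] += c
        self.n += other.n
        self.compress()

    def quantile(self, q):
        # Visit nodes ordered by the upper bound of their value range
        # (narrow ranges first) and accumulate counts up to rank q*n.
        def hi(node):  # largest leaf value covered by `node`
            while node < UNIVERSE:
                node = 2 * node + 1
            return node - UNIVERSE
        rank, target = 0, q * self.n
        for node in sorted(self.counts, key=lambda v: (hi(v), v)):
            rank += self.counts[node]
            if rank >= target:
                return hi(node)
        return UNIVERSE - 1
```

With a large `k` (little compression) queries are exact; shrinking `k` collapses more of the tree into coarser bands, which is where the fixed-size/error trade-off shows up.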
If anyone knows of a time-series system using something like this, I'd love to learn about it.