Hacker News new | past | comments | ask | show | jobs | submit login

But those queries aren’t equivalent so how is anything saved by materializing the second one?

e:

I believe (I could be wrong!) you edited the second query from

  SELECT dimension, measure
    FROM table
    GROUP BY 1
To

  SELECT dimension, filter, measure
    FROM table
    GROUP BY 1, 2
This addresses the filtering but how is that any different from the original table? Presumably `table` could have been a finer grain than the filter and dimension but you’d do better to add the rest of the dimensions as well, at which point you’re most of the way to a star schema.

This kind of pre-computed aggregate is typical in data warehousing. But is it really an “OLAP cube”?

In general I agree there is value in the methods of the past and we would be well served to adapt those concepts to our work today.




By grouping them you need to include aggregate functions like avg,max,min,sum,count for measure and dimension.


It’s much smaller than the original table. If you compute lots of these, then voila, you have an OLAP cube.


Well the size savings is a function of the number of included dimensions and the original table grain.

I wouldn’t call this an “OLAP Cube”. It’s just an aggregated fact table. A collection of those with their corresponding dimensions is a “data mart”.


A data mart is a logical organization of data to help humans understand the schema. What I am describing is a physical optimization, extremely similar to what an OLAP cube would do, but implemented on top of a SQL data warehouse. It’s an orthogonal concept to a data mart.


I guess I still don’t know what an OLAP Cube is.


He did edit his comment, but unfortunately didn't acknowledge the edit.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: