> What do you love so about PowerBI? For a large portion of my career, the dashb...

jaakl · 2024-09-27T09:30:03 1727429403

The problem of semantic models what I've seen in tools like Looker, Tableau and Qlik (very probably same for PowerBI) is that they are tightly coupled to the tool itself, work within them only. Now you want "modern data system" then you want them decoupled and implemented with an open semantic model which is then accessable by data consumers in Google spreadsheets, Jupyter notebooks and whatever BI/Analytics/reporting tools your stakeholder uses or prefers. There are very new solutions for this like dbt semantic models; their only issue is that they tend to be so fresh that bigger orgs (where they do make most sense) may be shy on implementing them yet. To the original topic - not sure how much PG17 can be used in these stacks, usually much better are analytical databases - BigQuery, Snowflake, maybe Redshift, future (Mother)Duck(db)

greggyb · 2024-09-27T13:16:05 1727442965

The semantic model in Power BI is not tightly coupled to the tool. It is an SSAS Tabular model. It is pretty trivial to migrate a Power BI model to Analysis Services (Microsoft's server component for semantic models, hostable on-prem or as a cloud offering).

Both Power BI and Analysis Services are is accessible via XMLA. XMLA is an old standard, like SOAP old, much older than dbt.

XMLA provides a standard interface to OLAP data and has been adopted by other vendors in the OLAP space, including SAP and SAS as founding members. Mondrian stands out in my mind as an open source tool which also allows clients to connect via XMLA.

From what I can see, dbt only supports a handful of clients and has a home-grown API. While you may argue that dbt's API is more modern and easier to write a client for (and I'd probably agree with you! XMLA is a protocol, not a REST API), the point of a standard is that clients do not have to implement support for individual tools.

And of course, if you want API-based access there is a single API for a hosted Power BI semantic model to execute arbitrary queries (not XMLA), though its rate limits leave something to be desired: https://learn.microsoft.com/en-us/rest/api/power-bi/datasets...

Note the limit on "one table" there means one resultset. The resultset can be derived from a single arbitrary query that includes data from multiple tables in the model, and presented as a single tabular result.

Note: XMLA is an old standard, and so many libraries implementing support are old. It never took off like JDBC or ODBC did. I'm not trying to oversell it. You'd probably have to implement a fair bit of support yourself if you wanted a client tool to use such a library. Nevertheless it is a protocol that offers a uniform mechanism for accessing dimensional, OLAP data from multiple semantic model providers.

With regard to using PG17, the Tabular model (as mentioned above, shared across Power BI and Analysis Services) can operate in a caching/import mode or in a passthrough mode (aka DirectQuery).

When importing, it supports a huge array of sources, including relational databases (anything with an ODBC driver), file sources, and APIs. In addition to HTTP methods, Microsoft also has a huge library of pre-built connectors that wrap APIs from lots of SaaS products, such that users do not even need to know what an API is: these connectors prompt for credentials and allow users to see whatever data is exposed by the SaaS that they have permissions to. Import supports Postgres

When using DirectQuery, no data is persisted by the semantic model, instead queries are generated on the fly and passed to the backing data store. This can be configured with SSO to allow the database security roles to control what data individual users can see, or it can be configured with security roles at the semantic model layer (such that end users need no specific DB permissions). DirectQuery supports Postgres.

With regard to security, the Tabular model supports static and dynamic low-level and object-level security. Objects may be tables, columns, or individual measures. This is supported for both import and DirectQuery models.

With regard to aggregation, dbt seems to offer sum, min, max, distinct count, median, average, percentile, and boolean sum. Or you can embed a snippet of SQL that must be specific to the source you're connecting to.

The Tabular model offers a full query language, DAX, that was designed from the ground up for expressing business logic in analytical queries and large aggregations. Again, you may argue that another query language is a bad idea, and I'm sympathetic to that: I always advise people to write as little DAX as possible and to avoid complex logic if they can. Nevertheless, it allows a uniform interface to data regardless of source, and it allows much more expressivity than dbt, from what I can see. I'll also note that it seems to have hit a sweet spot, based on the rapid growth of Power BI and the huge number of people who are not programmers by trade writing DAX to achieve business goals.

There are plenty of pain points to the Tabular model as well. I do not intend to paint a rosy picture, but I have strived to address the claims made in a way that makes a case for why I disagree with the characterization of the model being tightly coupled to the reporting layer, the model being closed, and the model being limited.

Side note: the tight coupling in Power BI goes the other way. The viz layer can only interact with a Tabular model. The model is so broad because the viz tool is so constrained.

jaakl · 2024-09-27T18:02:56 1727460176

Thanks for the writeup. There are indeed use cases, especially in the MS multiverse. Proof of the none->basic->complex “can do everything” (soap,xml,rpc)->radically simpler “do what really matters” (rest, json, markdown) path. I’m not really sure if dbt semantic layer is the final open “standard” for the future analytical models and metrics, it has own questionmarks, it is literally just a transformer with metrics as addon and there are just initial implementations, but today I’d rather give that thing a try. Simpler is so much better

bigger_cheese · 2024-09-27T01:47:45 1727401665

My Org has been heavily pushing us to use Power BI, I have found it has a lot of pain points.

I expect how useful it is depends a lot on your use cases. If you want to throw a couple of KPI's on a dashboard it fulfils that requirement pretty well but if you want to do any analytics (beyond basic aggregations like, min, max and median) or slightly complicated trending (like multiple Y axis) Power BI is painfully complicated.

systems · 2024-09-27T03:22:39 1727407359

PowerBI might be the only BI solution with a decent functional language

DAX is supper powerful, if you design your datawarehouse as a proper stat scheme, there is nothing you cannot calculate using DAX

Nothing in the market is even close, most BI tool have magic expression language, DAX is a proper language

bigger_cheese · 2024-09-27T05:50:56 1727416256

I'm sure it can work but I've found it hard to slot into our use cases.

Power BI's trending is one thing I have really struggled with the limitations it has.

For one of my use cases (industrial plant data) we often want to trend timeseries data on a common xaxis with uniquely scaled yaxis.

i.e Temperature, Pressure, Flow Rate, Fill height etc all have differently scaled yaxis but common xaxis.

Something like this example https://imgur.com/a/zd5Giom

Power BI limits you to a single primary and a single secondary Y axis only (i.e limit of two scales).

systems · 2024-09-27T06:41:10 1727419270

sorry weird question, did you keep changing this link? was one of the link on powerbi.tips

can you put that back, for some reason my browser history lost it , or something weird happened, not sure what it is

bigger_cheese · 2024-09-29T22:40:32 1727649632

I edited the link once to a graph that better illustrated what I meant both results came from a google image search.