> A relational database without relations is an oxymoron.
OK. You're the only one talking to this straw man though. :-) Every Vitess user that I'm aware of has a pretty typical 2NF/3NF schema design. A small sampling of them is listed here: https://vitess.io
You set up your data distribution/partitioning/sharding scheme so that you have data locality for 99.9999+% of your queries -- meaning that the query executes against a data subset that lives on a single shard/node (e.g. sharding by customer_id) -- and you live with the performance hit and consistency tradeoffs for those very rare cases where cross-shard queries cannot be avoided (Vitess does support this). You should do this even if the solution you're using claims to have distributed SQL with ACID and MVCC guarantees/properties. There's no magic that improves the speed of light and removes other resource constraints. In practice most people say they want perfect security/consistency/<name your desired property here> but then realize that the costs (perf, resources, $$, etc.) are simply so high that it is not practical for their business/use case.
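To make the locality point concrete, here is a rough sketch of routing by a sharding key. It is not how Vitess implements it (Vitess has its own routing machinery); `NUM_SHARDS`, `shard_for`, and `route` are hypothetical names made up for illustration, and the hash choice is arbitrary.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count, for illustration only


def shard_for(customer_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a customer_id to one shard with a stable hash (assumed scheme)."""
    digest = hashlib.md5(str(customer_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(customer_id=None, num_shards: int = NUM_SHARDS) -> list:
    """Return the shards a query has to touch.

    A query filtered on the sharding key hits exactly one shard.
    A query without it has to scatter to every shard, which is the
    rare, slower case with weaker consistency.
    """
    if customer_id is not None:
        return [shard_for(customer_id, num_shards)]
    return list(range(num_shards))


# Query scoped to one customer: a single shard, so local JOINs work.
single = route(customer_id=42)
# Query with no customer_id: scatter-gather across all shards.
scatter = route()
```

As long as nearly every query carries the sharding key, nearly every query takes the single-shard path, and the scatter path stays the exception.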
I know MySQL fairly well (I started working at MySQL AB in 2003), and you can certainly claim that "MySQL-compatible" is dishonest, but I would offer a counterclaim: either you don't know this space very well or you're not operating in good faith here.
To be fair, I skimmed through your docs and did misread them initially: I thought you don't allow foreign keys, but you actually don't allow foreign key constraints.
If you are still allowing JOINs within a shard, then I need to apologize.
To pile on your answer a bit, the manual bucketing you describe is exactly how ClickHouse works in most cases. It won't allow joins / IN on multiple distributed tables (i.e., sharded/replicated tables) unless you explicitly set a property called distributed_product_mode. [0] This is to prevent you from shooting yourself in the foot, either through bad performance or improperly distributed data.
This constraint will eventually be relaxed, but most apps are able to work around it just fine. The ones that can't work around it use Snowflake.