
It reminds me of the common kind of optimization story (how we got 2x+ faster by no longer doing something very inefficient), in this case going from 6 to 3 replicas.

Example: at one point TiDB didn't store rows clustered by the primary key on disk (the primary key was a separate index). This is very costly in distributed setups (less costly on single-node systems like PostgreSQL).
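A rough back-of-the-envelope sketch of why this hurts more in a distributed store (the numbers and the all-rows-on-different-nodes worst case are illustrative assumptions, not TiDB measurements):

```python
# Hypothetical cost model for a 100-row primary-key range scan.
# Clustered layout: the rows sit adjacent on disk, so one index seek
# plus one sequential range read (one RPC in a distributed store)
# returns them all.
# Non-clustered layout: the PK index yields row locations that must
# each be fetched individually; in the worst case every row lives on
# a different node, so each fetch is a network round trip.

ROWS = 100

clustered_rpcs = 1              # one range read
non_clustered_rpcs = 1 + ROWS   # index read + one point lookup per row

print(clustered_rpcs, non_clustered_rpcs)  # 1 vs 101 round trips
```

On a single node the non-clustered lookups are just extra random reads; over a network each one becomes a round trip, which is where the large multiplier comes from.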

There are many such cases in many databases. Another cost most databases pay is LSM compaction overhead repeated on every replica when you're not using shared distributed storage.
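The replica multiplier is easy to state as arithmetic (a hypothetical cost model, with made-up work units):

```python
# With replica-local LSM trees, every replica re-compacts its own copy
# of the same logical data. With shared distributed storage, compaction
# runs once and all replicas read the compacted result.

def total_compaction_work(replicas: int, work_per_copy: float,
                          shared_storage: bool) -> float:
    """Work units spent compacting one logical dataset."""
    return work_per_copy if shared_storage else replicas * work_per_copy

# 3 replicas, 10 work units per compaction pass:
print(total_compaction_work(3, 10.0, shared_storage=False))  # 30.0
print(total_compaction_work(3, 10.0, shared_storage=True))   # 10.0
```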

The same optimization can be seen in QuickWit (building/compacting an inverted index is even more expensive than LSM compaction).
