
“high availability through Paxos based replication”. I thought Paxos is supposed to be a CP system.



HA isn't really about 100% availability. Any single system that promises such is extremely likely to be misleading you somehow. Your in-flight query is going to get interrupted no matter how fancy your clustering is, and I struggle to even come up with hypothetical use cases where this is something you can't afford to have happen, ever.

All you need is that in the event of a failure the clustered system can still recover quickly enough (to a well-defined state!) that the application layer can deal with the transient failure without significant impact on users, maintaining the illusion of availability.
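To make "the application layer can deal with the transient failure" concrete, here is a minimal sketch of a retry wrapper with exponential backoff. Everything in it is an assumption for illustration: `TransientError`, `run_with_retry`, and the default attempt count and delay are hypothetical names and values, not part of any particular database driver.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a failover in progress)."""


def run_with_retry(operation, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff.

    The delay doubles after each failed attempt; the last failure is re-raised.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)
```

This is only safe as written when the wrapped operation is idempotent, since an interrupted query may or may not have taken effect before the failure.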


> I struggle to even come up with hypothetical use cases where this is something you can't afford to have happen, ever.

A rocket is using this query to adjust its thrusters ;)


Such a rocket would likely have two or more independent systems that would each have to agree on the adjustment, so one of them temporarily failing would not pose a problem. Though I doubt there are any rockets using database queries as part of their control system.

In those kinds of systems I suspect the approach is to enumerate every possible scenario and prove that the system behaves correctly in all of them, and if you can't do that, the system may be too complex and you need to redesign it to be simpler so that you can guarantee that it does not fail.


You have N >= 3 nodes, or N >= 2 and a non-compute arbitrator. One of them goes down and stops responding. You still have quorum, so data processing continues just fine. That's high availability in a CP system.
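The quorum argument above comes down to simple arithmetic. A rough sketch, assuming a plain majority quorum in which every node and the arbitrator each count as one voter (`has_quorum` and `tolerated_failures` are made-up helper names, not from any Paxos library):

```python
def has_quorum(total_voters: int, reachable_voters: int) -> bool:
    """Majority quorum: strictly more than half of all voters must be reachable."""
    return reachable_voters > total_voters // 2


def tolerated_failures(total_voters: int) -> int:
    """Largest number of voters that can fail while a majority remains."""
    return (total_voters - 1) // 2
```

With 3 voters (three nodes, or two nodes plus an arbitrator), losing one still leaves 2 of 3, so the cluster keeps serving. Losing two leaves it without quorum, and it stops accepting writes rather than risk inconsistency, which is the CP part.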



