
They have a typical master/slave setup with memcached in front. http://www.slideshare.net/err/inside-github



Since they're using MySQL, is there a technical reason (as opposed to historical/lack of time reason) they're not using Galera Cluster?

I'm in the process of migrating an existing datastore to MariaDB+Galera, and so far it seems like everything I could hope for in a clustered RDBMS.


Last time I looked into Galera, it lacked support for query caching. Over 50% of our queries are cache hits, which made it hard to justify Galera over a normal master+slave setup. However, I could see it being useful for setups where a single server can't handle the load (we average 300 queries/sec on a single server with lots of room to spare).


They still disable the query cache, but MySQL's query cache generally isn't considered all that great a thing anyway, so few people care. You're better off making judicious use of Redis or memcached.
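A minimal sketch of what "judicious use of Redis or memcached" usually means in practice, i.e. a cache-aside read: check the cache, fall back to the database on a miss, then store the result. `TTLCache` is a hypothetical in-process stand-in for the real cache server, and `fetch_user` and `load_from_db` are made-up names for illustration, not anything from GitHub's setup.

```python
import time


class TTLCache:
    """Hypothetical in-process stand-in for Redis or memcached."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._store = {}             # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def fetch_user(cache, user_id, load_from_db):
    """Cache-aside read: only hit the database on a cache miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = load_from_db(user_id)      # assumed to run the actual SQL query
    cache.set(key, row)
    return row
```

With a real Redis or memcached client the shape stays the same; only the `get`/`set` calls change.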

The biggest win for Galera is high availability that actually works with minimal effort. (I've never experienced a high-availability solution not based on multi-master/all-nodes-hot principles that didn't cause more problems than it solved.)

They also claim some scalability wins at the front end, but I haven't really tested that, and am content with the performance not being terrible.


> but MySQL's query cache generally isn't considered all that great a thing anyway

You've never had to prime a query cache on a MySQL server, have you? :)


Of course not. I use a caching layer with lower overhead that doesn't invalidate every cached result for a table when a single record in it changes. The query cache just isn't competitive.
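To make the invalidation point concrete: MySQL's query cache drops all cached results for a table whenever that table is modified, while an application-level cache can evict just the affected key. A rough sketch of the difference, using two hypothetical toy classes (neither is a real MySQL, Redis, or memcached API):

```python
class TableScopedCache:
    """Toy model of table-wide invalidation: any write to a table
    discards every cached result for that table."""

    def __init__(self):
        self._by_table = {}          # table -> {query text -> result}

    def get(self, table, query):
        return self._by_table.get(table, {}).get(query)

    def set(self, table, query, result):
        self._by_table.setdefault(table, {})[query] = result

    def on_write(self, table):
        # One changed row wipes all cached queries on the table.
        self._by_table.pop(table, None)


class RecordScopedCache:
    """Toy model of a per-record cache: a write evicts one key only."""

    def __init__(self):
        self._by_key = {}            # (table, primary key) -> row

    def get(self, table, pk):
        return self._by_key.get((table, pk))

    def set(self, table, pk, row):
        self._by_key[(table, pk)] = row

    def on_write(self, table, pk):
        # Only the record that changed is evicted.
        self._by_key.pop((table, pk), None)
```

On a write-heavy table the first design keeps throwing away entries that are still valid, which is the overhead being complained about here.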



