From 2014 there's a video link and slides here: https://www.erlang-factory.com/sfbay2014/rick-reed

FYI: I've seen comments from after I left that wandist isn't used anymore; I think a lot of what we gained with it was working around issues in pg2 that stem from global locks not scaling... but the new pg doesn't need global locks at all. There were also some things in wandist to work around difficulties communicating between SoftLayer nodes and Facebook nodes, but that was a transitory need. See the 2024 presentation, 40k nodes!

Fairly similar, but with smaller numbers, in 2012: http://www.erlang-factory.com/conference/SFBay2012/speakers/...

The 2013 presentation is focused on MMS; I don't remember if it was as impressive: http://www.erlang-factory.com/conference/SFBay2013/speakers/... (note that server side transcoding is from before end to end encryption)

I don't think there were similar presentations on Erlang in the large at WhatsApp after that. Big changes between 2014 and 2019 (when I left) were

a) chat servers started doing a lot more, and clients per server went down on the big SoftLayer boxes

b) hosting moved from SoftLayer to Facebook and much smaller nodes --- also, chat servers at SoftLayer were individually addressable, using (augmented) round robin DNS to select one; at Facebook the chat servers did not have public addresses, and instead everything comes in through load balancers

c) MMS was pretty much offloaded into a Facebook storage service (C++); not because the Erlang wasn't sufficient, but because MMS was loosely coupled with the rest of the service, Facebook had a nice enough storage service, a lot of storage, an awful lot of bandwidth, and it wasn't a lot of work for that team to also support WhatsApp's needs; also our Erlang MMS (and the PHP version before it) was built around storing files on specific, addressable nodes, but nodes at Facebook are much more ephemeral and not easy for clients to address directly

d) some amount of data storage moved off of mnesia into other Facebook data storage technology; again, not because mnesia wasn't sufficient, but because more ephemeral nodes make it cumbersome (the addressability problem again) and the available hardware nodes at FB didn't really match --- there's a very firm bias at FB towards using standard node sizes, and the available standard nodes were like a web machine with not much RAM or a big database machine with more RAM and tons of fast storage; WA mnesia wants lots of RAM but doesn't need a lot of storage (all data is in RAM, and dumped + logged to disk) so there was a mismatch there --- things that stayed in mnesia needed much larger clusters to manage data size

Presentations became less common because there were more layers to get approval from, and also because it's less fun to share how we built something on top of proprietary layers that others don't really have access to. Anybody could have gotten dual 2690 servers at SoftLayer and run a nice Erlang cluster. Only a few people could run an even bigger chat cluster in a Facebook-like hosting environment.