I'm assuming Yegge was referring to the RPC framework.

jrochkind1 · on Oct 1, 2019

"An internal communication system" does sound like an "RPC framework", but Yegge's paraphrase actually says "It doesn’t matter what technology they use. HTTP, Corba, Pubsub, custom protocols — doesn’t matter. Bezos doesn’t care."

I read this as saying different teams/services don't have to use the same thing either. That leaves the door open to everyone doing things in a diverse mishmash, which isn't what I'd call an "RPC framework" or "an internal communications system" at all.

But was/is there in fact an Amazon-specific "RPC framework" that all Amazon services use consistently across services? I haven't heard of an Amazon "RPC framework" before, or what it's called, so I'm curious to learn more. And OP doesn't specify it either; does the rest of the audience know what's being talked about, and I'm just missing context?

If that is the thing that the OP thinks is really what Amazon got right... then the interesting thing is figuring out how it went from the paraphrased email, which doesn't actually demand such a thing, to.... such a thing. Who designed or chose this "RPC framework"? When? How? How'd they get everyone to use the same one? If that's the thing Amazon got right, there are some steps missing between the Yegge-paraphrased email and there, since the email doesn't actually even call for such a thing.

Or is that not what happened at all? In that case I'm still not sure what OP means by "an internal communication system" being the thing Amazon got right.

blandflakes · on Oct 1, 2019

This edict was before my time at Amazon, so I can't speak to whether there was an RPC framework in existence when this was mandated.

By the time I arrived, however, there was a cross-language RPC framework that integrated with Amazon's monitoring, request tracing, and build infrastructure (for building and releasing client versions). It was very full-featured and the de facto system for creating a service. Most of the communication in my organization was done using this framework, and systems that violated the "only communicate over a service boundary" mandate were real problem children.

jrochkind1 · on Oct 2, 2019

Interesting, people don't talk about this much, although the OP seems to be aware of it and think it was important.

Does anyone know if there's been much written on how this came to be and what it looked like? If not, it would be a useful thing to write about!

Because it does seem like a really important thing. Without it, the narrative seems to be that you make a decree like Bezos', and bing bang magic, you get what AWS got. In fact, successfully pulling off that RPC framework seems to be really important, and it undoubtedly took a lot of work, good design, and social organizing to get everyone to use it (perhaps by making it the easy answer to Bezos' mandate). None of that just happens: some have failed where AWS succeeded, and the mandate alone isn't enough.

blandflakes · on Oct 2, 2019

I think a lot of Amazon's internal tooling is sort of "unpublished" - I've not found a great reference for a lot of the really excellent dev support they had.

The AWS story is particularly interesting because a lot of the internal setup I was doing at the time was on old-fashioned metal. There was an internal project called Move to AWS (MAWS) that encouraged using newly developed integrations with the AWS systems that the public was using.

In other words, AWS lived alongside old-fashioned provisioning practices up until even the early 2010s.

morelisp · on Oct 1, 2019

In my experience with much smaller teams (5-50 programmers) the main challenges are three-fold:

1) getting developers to talk about formalizing (to any degree) cross-system responsibility at all rather than just hanging code where it's easiest at-hand;

2) getting developers to think about external inputs/outputs at all and everything that entails, e.g. namespacing, versioning, forwards and backwards compatibility, validation, access control, ...

3) teaching developers to pick the right kind of interface for their data and processing, i.e. more or less queue vs. pubsub vs. RPC vs. REST vs. query language; of the three this is the only "technical" one, the others will be harder sells.

Once you have done these, the question of _which_ queue/etc. you take is largely irrelevant; there will be some natural pressure to standardize and even "unnatural" pressure from CTOs/management will be straightforward to implement.
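
To make point 2 a bit more concrete, here is a minimal sketch of what "thinking about external inputs" can look like at a service boundary. Everything in it is hypothetical and only for illustration: the `orders.OrderCreated` type name, the `version`/`body` envelope fields, and the rule that a major version bump means a breaking change are assumptions, not any particular framework's conventions.

```python
import json
from dataclasses import dataclass

SUPPORTED_MAJOR = 1  # hypothetical: the major schema version this reader understands


@dataclass(frozen=True)
class OrderCreated:
    order_id: str
    amount_cents: int
    currency: str = "USD"  # assumed to be added in a later minor version


def parse_order_created(raw: str) -> OrderCreated:
    """Validate and decode one incoming message at the service boundary."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("message must be a JSON object")

    # Namespacing: reject messages that belong to some other type.
    if msg.get("type") != "orders.OrderCreated":
        raise ValueError(f"unexpected message type: {msg.get('type')!r}")

    # Versioning: a different major version is treated as a breaking change.
    version = msg.get("version")
    if not isinstance(version, str) or not version.split(".")[0].isdigit():
        raise ValueError("missing or malformed version")
    if int(version.split(".")[0]) != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported major version: {version}")

    body = msg.get("body")
    if not isinstance(body, dict):
        raise ValueError("body must be a JSON object")

    # Validation: required fields must be present and well-typed.
    order_id = body.get("order_id")
    amount = body.get("amount_cents")
    if not isinstance(order_id, str) or not order_id:
        raise ValueError("order_id must be a non-empty string")
    if isinstance(amount, bool) or not isinstance(amount, int) or amount < 0:
        raise ValueError("amount_cents must be a non-negative integer")

    # Backward compatibility: older senders omit "currency", so default it.
    # Forward compatibility: unknown extra fields in the body are ignored.
    currency = body.get("currency", "USD")
    if not isinstance(currency, str):
        raise ValueError("currency must be a string")

    return OrderCreated(order_id, amount, currency)
```

The point is less the specific checks than that they live in one place at the boundary, so which transport carries the message (queue, pubsub, RPC) stays a separate decision.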