That approach -- sending the schema with the message and having the receiving end use it to reconcile version skew -- seems extremely expensive.
For it to be a win in terms of message size, the message has to be pretty large, or the schemas have to be negotiated at the start of a session that lasts a while. Maybe it would be a win for some use cases, but there are certainly plenty of use cases where the schema would end up much larger than the actual payload.
The bigger problem, though, is complexity. Code at the receiving end has to parse the schema, diff it against its own expectations, build the proper tables to translate the message content to the desired version, and then apply them.
I admit I haven't actually tried building something like this, nor have I looked much at Avro's implementation, but this all sounds really slow and error-prone to me.
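To spell out what I mean by "build the proper tables", here's a rough sketch in Python. The dict-shaped schemas are my own invention; it roughly mirrors Avro's documented resolution rules for added and removed fields, but ignores type promotion, renames, and nesting entirely:

    # Roughly the machinery I'm imagining the receiver needs. The dict-shaped
    # schemas are made up for illustration; this handles added/removed fields
    # only, ignoring type promotion, renames, and nested records.

    def build_translation(writer_fields, reader_fields):
        """Diff the sender's schema against ours; build a per-field plan."""
        writer_names = {f["name"] for f in writer_fields}
        plan = []
        for f in reader_fields:
            if f["name"] in writer_names:
                plan.append(("copy", f["name"], None))          # in both schemas
            elif "default" in f:
                plan.append(("fill", f["name"], f["default"]))  # sender predates this field
            else:
                raise ValueError(f"no value or default for {f['name']!r}")
        return plan  # fields the sender has but we don't are silently dropped

    def apply_translation(plan, record):
        """Apply the plan to one decoded record."""
        return {name: record[name] if op == "copy" else default
                for op, name, default in plan}

And that's just the flat-record case; every message has to flow through whatever this grows into.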
> the message has to be pretty large, or the schemas have to be negotiated at the start of a session that lasts a while
Those are the two main use cases for a binary serialization format, though. Either you have a very large row-oriented file with lots of homogeneous records, say an SSTable or BigTable or Hadoop SequenceFile or a Postgres binary column. Or you have a long-lasting RPC connection that serves one particular service interface, where you'd want to negotiate the protocol used by the service at the beginning and keep it up for hours.
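In the file case the schema cost is amortized essentially to zero. A sketch with fastavro (library choice and record shape are mine, just for illustration): the schema is written once into the file header, and every record after it is schema-free:

    import io
    import fastavro  # assuming fastavro is installed; record shape invented

    schema = fastavro.parse_schema({
        "type": "record", "name": "Click",
        "fields": [{"name": "user_id", "type": "long"},
                   {"name": "url", "type": "string"}],
    })

    buf = io.BytesIO()
    # The schema goes into the file header exactly once, then 100k bare rows.
    fastavro.writer(buf, schema, ({"user_id": i, "url": "/"} for i in range(100_000)))

    buf.seek(0)
    for record in fastavro.reader(buf):  # reader picks the schema up from the header
        pass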
I can think of a couple exceptions, like end-user HTTP connections to a public API. But for those, you usually need a complicated capability-based protocol negotiation anyway, because you need to handle hostile (hacked) clients or old versions that are stuck on something from 5 years ago. Google's habit of sticking a custom message type in a MessageSet and sending it along for a feature that triggers on 0.1% of traffic isn't really a thing outside of Google (and possibly other FANGs), not least because most companies can't afford to staff a team to maintain a feature used by 0.1% of traffic.
The solution for complexity is pretty routine: hide it behind a library. I'm not terribly fond of the particular API that Avro provides, but the wire format is sound & simple and there's nothing preventing someone from writing alternative library implementations.
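To make that concrete: with fastavro (one such alternative implementation; the schemas below are invented for illustration), the whole parse/diff/translate dance from the parent comment collapses into a single call that takes both the writer's and the reader's schema:

    import io
    import fastavro

    writer_schema = fastavro.parse_schema({
        "type": "record", "name": "User",
        "fields": [{"name": "id", "type": "long"}],
    })
    reader_schema = fastavro.parse_schema({
        "type": "record", "name": "User",
        "fields": [{"name": "id", "type": "long"},
                   {"name": "email", "type": "string", "default": ""}],  # newer field
    })

    buf = io.BytesIO()
    fastavro.schemaless_writer(buf, writer_schema, {"id": 42})

    buf.seek(0)
    # The library does the diff-and-translate work; the caller passes both schemas.
    record = fastavro.schemaless_reader(buf, writer_schema, reader_schema)
    assert record == {"id": 42, "email": ""}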
This is my major criticism of protobuf and by extension grpc. IMO it emphasises two salient points: size on the wire and centralised control. The first is laudable, but to my mind it's taken too far because of the second. A decentralised distributed system requires ease of discovery and a decoupling of server and client.
I'm well aware of the many reasons to build distributed systems, not least of all the ability to distribute engineering effort, and so I can see that if team distribution is a primary motivator for creating a microservice system, there would be a desire to make it appear as though it's actually all one process. Of course it isn't one process, but I can see the desire.