
Keep in mind you are still going to have many of the same problems you mention inside the binary blocks and header blocks. The specific annoyances of HTTP/1.1 will be gone, but new ones will appear.

Going binary does not suddenly make things easier; it just slices the problem up and adds a layer of obfuscation.

It is easier to see what the hell is going on across the wire with the current formats, and easier to debug them. Utopian interop does not exist, so Postel's Law has gotten us this far. Being text no doubt makes things easier to debug and interoperate with; otherwise we'd be sending binary blobs instead of JSON. Unless you control both endpoints, Postel's Law comes into play and simplicity wins.

We are moving in a new direction, for better or worse, and going live. I feel like it is slightly off the right path, but sometimes you need to take a wrong step, like SOAP did, to get back to simple. We'll see how it goes.




A binary protocol's parsing is usually something like: read 2 bytes from the wire, decode N = uint16(b[0])<<8 | uint16(b[1]), then read N bytes from the wire. A text-based protocol's parsing almost always involves a streaming parser, which is tricky to get correct and always less efficient.
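To make that concrete, here's a minimal sketch in Go of the length-prefixed read described above. The 2-byte big-endian prefix and the readFrame helper are assumptions for illustration only; real HTTP/2 frames use a 24-bit length plus type, flags, and stream-id fields.

    // Sketch only: a generic [2-byte length][payload] frame reader,
    // not the actual HTTP/2 frame layout.
    package main

    import (
        "bytes"
        "encoding/binary"
        "fmt"
        "io"
    )

    // readFrame reads one length-prefixed frame from r.
    func readFrame(r io.Reader) ([]byte, error) {
        var hdr [2]byte
        if _, err := io.ReadFull(r, hdr[:]); err != nil {
            return nil, err
        }
        n := binary.BigEndian.Uint16(hdr[:]) // N = uint16(b[0])<<8 | uint16(b[1])
        payload := make([]byte, n)
        if _, err := io.ReadFull(r, payload); err != nil {
            return nil, err
        }
        return payload, nil
    }

    func main() {
        var buf bytes.Buffer
        msg := []byte("hello")
        binary.Write(&buf, binary.BigEndian, uint16(len(msg))) // 2-byte length prefix
        buf.Write(msg)

        frame, err := readFrame(&buf)
        fmt.Println(string(frame), err) // hello <nil>
    }

No scanning, no delimiters, no escaping: the receiver always knows exactly how many bytes to expect next.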

Besides, I think this is a moot point, because chances are that fewer than 100 people's HTTP/2 implementations will serve 99.9999% of traffic. It's not like you or I spend much of our time deep in nginx's code debugging some HTTP parsing; I think it's just as unlikely we'll be doing that for HTTP/2 parsing.

Also, HTTP/2 will (pretty much) always be wrapped in TLS, so it's not like you're going to be looking at a plain-text dump of it. You'll be using a tool, and that tool's author will implement a way to convert the binary framing to human-readable text.

Another way to put it is that the vast majority of HTTP streams are never examined by humans, only by computers. Choosing a text-based protocol just seems like a way to adversely impact the performance of every single user's web browsing.

Another another way to put it is that there is a reason that Thrift, Protocol Buffers, and other common RPC mechanisms do not use a text-based protocol. Nor do IP, TCP, or UDP, for that matter. And there's a reason that Memcached was updated with a binary protocol even though it had a perfectly serviceable text-based protocol.


Agreed on all points. Binary protocols are no doubt better: faster, more efficient, and more precise. I use reliable UDP all the time in game servers/clients. Multiplayer games have to be efficient; even TCP is too slow for real-time gaming.

Binary protocols work wonderfully... when you control both endpoints, the client and the server.

Where interoperability breaks down is when you don't control both endpoints. Efficiency and exactness can be enemies of interoperability at times; we currently use very forgiving systems rather than throwing them out and asserting with a crash dump on every communication error. Network data is a form of communication.

Maybe you are right: since it is binary, only a few hundred implementations might be made, and those will be made by better engineers since it is more complex. Maybe HTTP is really a lower-level protocol now, like TCP or UDP. Maybe, since Google controls Chrome, holds the browser lead, and has enough engineers to ensure all the leading implementations, OSs, server libraries, and webservers are correct, it may work out.

As engineers we want things to be exact, but there are always game bugs not found in testing, and hidden new problems that we aren't weighing against the known current ones. Getting something new is nice because all the old problems are gone, but there will be new problems!

It will be a whole new experiment, moving away from text/MIME-based protocols to something lower level, complex, and exact rather than simple and interoperability-focused. Let's see if the customers find any bugs in the release.


>Binary protocols work wonderfully... when you control both endpoints, the client and the server.

IP is all binary and I don't think it's a case of one party controlling all endpoints.


Binary protocols are usually far easier to implement, both sending and receiving. There is far less ambiguity.
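For instance, here's a rough sketch in Go (purely illustrative; the parseHeaderLine helper and the accepted variants are my own, not the actual HTTP grammar) of the decisions even a single "Name: Value" line forces on a text parser, decisions that don't exist when fields sit at fixed offsets:

    // Sketch only: the loose variants real-world senders produce.
    package main

    import (
        "fmt"
        "strings"
    )

    func parseHeaderLine(line string) (name, value string, ok bool) {
        line = strings.TrimRight(line, "\r\n") // tolerate CRLF or a bare LF?
        i := strings.IndexByte(line, ':')
        if i < 0 {
            return "", "", false // no colon: error, or a folded continuation line?
        }
        name = strings.ToLower(strings.TrimSpace(line[:i])) // names are case-insensitive
        value = strings.TrimSpace(line[i+1:])               // optional surrounding whitespace
        return name, value, true
    }

    func main() {
        for _, l := range []string{"Content-Length: 42\r\n", "content-length:42\n", "Content-Length : 42"} {
            n, v, _ := parseHeaderLine(l)
            fmt.Printf("%q -> %q = %q\n", l, n, v)
        }
    }

Every one of those tolerances is a spot where two implementations can quietly disagree.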

In fact, the newline problem I mentioned? It was not easier to diagnose, and was only caught by tools that checked it as a binary structure.

Postel was just flat wrong, and history shows us this is so. JSON is popular because it was automatically available in JavaScript, and people dislike the bit of extra verbosity XML introduces. JSON is also a much tighter format than the text parsing the IETF usually specifies.

Postel's law also goes against the idea of failing fast. Instead, you end up thinking you're compliant because implementations just happen to interpret your mistake the right way. Then one day something changes and bam, it all comes crashing down. Ask any experienced web developer about the weird edge cases they have to deal with, again thanks to Postel's law.

And anyway, you know what everyone uses when debugging the wire? Wireshark or something similar. Problem solved. Same for things like JSON. Over the past few months I've been dealing with that a lot; every time I have an issue, I pull out a tool to figure it out.

Do you know the real reason for the text looseness? It's a holdover from the '60s. The Header: Value format was just a slight codification of what people were already doing. And why? Because they wanted a format that was primarily written and read by humans, with a bit of structure thrown in. Loose syntax is great in such cases. Modern protocols are not written by humans and are rarely read by them, so it's just a waste of electricity and developer time.



