The Blink Protocol (blinkprotocol.org)
91 points by robdjc on Jan 20, 2013 | 58 comments



Glanced through the spec and the comparison to protocol buffers on their blog. [0] A couple of quick notes:

- The messages are prefixed with the message length; this has a variety of advantages (e.g. splitting up messages with minimal parsing) but forces you to know the message size in advance. In the general case, this requires processing the full structure before the first byte can be written; see the sketch below the references. (On the other hand, protocol buffers can be written incrementally, with a single pass over the data and using constant memory.)

- Since the blink format contains no metadata, the recipient must have the schema with which the message was written. If schema migrations must be supported, this requires prefixing each batch of messages with the (encoded) schema. This may only be efficient when the batch size is large. (See Apache Avro [1] for another format that makes this tradeoff; since Avro is intended for processing in Hadoop, where the typical dataset has some millions of records, prefixing with the schema is not significant overhead.)

[0] http://blog.blinkprotocol.org/2013/01/blink-compared-to-goog... [1] http://avro.apache.org/docs/current/spec.html
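
A minimal sketch of that first point, assuming a toy fixed 4-byte length prefix (Blink's actual prefix encoding differs): the whole body has to be serialized into memory before the first byte can go on the wire, whereas a format without a leading length can be emitted field by field.

    import struct

    def write_length_prefixed(stream, serialize_body):
        # serialize_body() stands in for a full pass over the structure;
        # the body must be held in memory so its length can be written first.
        body = serialize_body()
        stream.write(struct.pack(">I", len(body)))  # toy 4-byte big-endian prefix
        stream.write(body)

    def write_incrementally(stream, serialize_fields):
        # By contrast, a tag-delimited format can stream each field as soon
        # as it is encoded, using constant memory.
        for chunk in serialize_fields():
            stream.write(chunk)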


There was this recent security presentation about formal language-theoretic reasons why message-length prefixes are a bad idea. http://www.cs.dartmouth.edu/~sergey/langsec/

The problem was that it causes unnecessary complexity in validation and parsing: <length-of-message><message>... describes a context-sensitive grammar, while a more XML- or JSON-like syntax with delimiters, <delim><message></delim>..., is context-free. The complexity of proper CSG validation can be exploited for resource-depletion/DoS attacks, while skipping validation leaves the part of the program that consumes these messages holding potentially invalid data, putting it into an inconsistent state and possibly leading to DoS or worse exploits.

At least that's what I got from their presentation, which was very interesting BTW, especially the part about "weird machines"--that's basically the inconsistent state, the emergent machine that is made out of all the things an attacker can make a system do by exploiting it. It was also extremely geeky.

It just left me wondering: if I understood this right, every protocol that uses message-length prefixes is at least vulnerable to resource-depletion attacks, and there's no way of solving this short of proving P=NP. But that's quite a lot of protocols! Why isn't the Internet burning down as we speak?


But in any normal design you never put the message length into the same parser; you use it to read the message, and the parser is fed the message once you have received that many bytes.

For the reader, the stream consists of <length>+<blob>+<length>+<blob>+ etc. It can be in two states: having enough data, in which case it passes the blob on to the parser (which doesn't have to validate that the blob is length bytes; we know that already); or awaiting more data (you can attack that by dribbling data slowly, but you can do that with any protocol).

EDIT: Ok, I used a flat structure there; it would be different in a recursive structure with a length on every element. But a top-level reader that reads <length> bytes of the outer wrapper and doesn't muck about with the innards is still a sensible thing.
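
A rough sketch of that split, assuming a toy fixed 4-byte big-endian prefix rather than any real protocol's encoding: the framer only tracks "enough data / awaiting data" and hands complete blobs to a separate parser.

    import struct

    class Framer:
        def __init__(self, parse_blob, max_size=1 << 20):
            self.parse_blob = parse_blob   # the actual message parser
            self.max_size = max_size       # guard against absurd lengths
            self.buf = b""

        def feed(self, data):
            self.buf += data
            while len(self.buf) >= 4:
                (length,) = struct.unpack_from(">I", self.buf)
                if length > self.max_size:
                    raise ValueError("frame too large")  # reset/close instead of buffering forever
                if len(self.buf) < 4 + length:
                    return                               # state: awaiting more data
                blob = self.buf[4:4 + length]
                self.buf = self.buf[4 + length:]
                self.parse_blob(blob)                    # state: enough data, hand it off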


First off, this isn't about deeply nested structures; the flat example demonstrates the problem perfectly fine :)

> But in any normal design you never put the message length into the same parser, you use it to read the message, and the parser is fed the message when you have received that many bytes.

Won't work; this is a fundamental problem in computer science.

Splitting the parser into a pre-parser and a post-parser isn't going to help solve the fundamental problem, because the combination of two parsers is still a parser.

One of the problems is that you cannot distinguish <blob> bytes from <length> bytes. If the data stream gets out of sync with the parser state (hiccup, dropped packet), you have a very non-trivial problem on your hands. A context-free grammar, however, is free of context (ohh!) and can therefore resync in time proportional to how deeply it's nested.

Speaking of nesting, that's another place where I expect CSGs to become incredibly hairy. Of course you can use a hybrid approach: length-prefixed messages for the "outer stream", and a context-free XML/JSON/Lisp-style format (delimiters on both sides[0]) for recursive structures. But why would you do that? If you wanted to save bytes by avoiding delimiters on the outermost structures, there are a lot more bytes to be saved by applying the same "optimization" to the inner recursive structures. If you don't know what I'm talking about, think about how a tree-like recursive data structure is represented in the memory of a C program. Yes: pointers. Alternatively you could length-prefix the nodes as before; C programs don't do that because you'd need to scan through everything and it's less efficient. Regardless, both approaches are context-sensitive, and good luck distinguishing malformed data from correct data.
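
A toy comparison of the two ways of writing the same nested structure (neither is a real format): with delimiters a reader can resynchronize by counting parentheses; with counts, every prefix depends on the contents below it.

    def encode_delimited(node):
        # context-free style: (a (b c) d)
        if isinstance(node, list):
            return "(" + " ".join(encode_delimited(n) for n in node) + ")"
        return node

    def encode_prefixed(node):
        # count-prefixed style: 3 a 2 b c d
        if isinstance(node, list):
            parts = [encode_prefixed(n) for n in node]
            return str(len(parts)) + " " + " ".join(parts)
        return node

    tree = ["a", ["b", "c"], "d"]
    print(encode_delimited(tree))   # (a (b c) d)
    print(encode_prefixed(tree))    # 3 a 2 b c d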

Now, this whole "formal languages and automata" business is a very complex subject[1], so while it may seem that the whole argument hinges on dropping a packet and desyncing the parser[2], I got the feeling from that talk that there are other (similarly fundamental) problems; but this particular one I understood and can make a compelling argument for :)

[0] afaik you might actually get away with a delimiter on just one side, but that makes it harder to parse because you need strict precedence rules to resolve ambiguities (e.g. 2+34+182+1+1+74321+0)

[1] it was considered one of the hardest courses during my CS college years (the other one being on "formal proofs of program correctness"). For various reasons I retook this course 4 times (underestimating its difficulty at first being one of those reasons), but when I finally did pass, I did so with a score of 9 out of 10; I'm kinda proud of that :P But the real benefit of studying 4 times for the same difficult course is that you never really forget it (some parallels there with that post about "spaced repetition learning" last week).

[2] Another thing they recommended that makes a lot of sense, but again is a parsing-complexity (security) vs. bandwidth-efficiency trade-off: make the delimiters (say, parentheses) out-of-band characters, so they're not allowed in binary blobs. This saves you from all sorts of escaping exploits (think XSS), makes resyncing more efficient and parsing a lot easier. Of course it's really hard to step out of the "we really need all 8 bits in a byte" paradigm; how else can you design data formats with out-of-band characters? I don't know, and the talk I watched didn't give a solution either, just that it would be a good idea (to which I agree).


Also, in applications where bit errors (or disconnections) are to be expected, the <length><data-of-that-length>... approach makes it hard to re-synchronize after errors. (Think of a serial cable that might be dis/reconnected, or radio-frequency links.)

A more robust encoding reserves some bit pattern as a <start-of-message> marker, and escapes regular appearances of this marker in the payload with some special sequence. (See, for example, http://en.wikipedia.org/wiki/High-Level_Data_Link_Control .)

When you start putting (for example) single Blink messages in distinct payloads in such a framing container, then the <length-of-message> field of Blink becomes redundant, of course.
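
The byte-stuffing idea in miniature (FLAG/ESC values borrowed from HDLC; the real protocol also carries address, control and CRC fields, so this is only a sketch):

    FLAG = b"\x7e"   # frame delimiter
    ESC = b"\x7d"    # escape byte; escaped bytes are XORed with 0x20

    def frame(payload):
        # escape any ESC or FLAG bytes in the payload, then wrap in flags
        stuffed = payload.replace(ESC, b"\x7d\x5d").replace(FLAG, b"\x7d\x5e")
        return FLAG + stuffed + FLAG

    def unframe(data):
        # A real receiver would scan forward for the next FLAG to
        # resynchronize after corruption; here we just undo the stuffing.
        body = data.strip(FLAG)
        return body.replace(b"\x7d\x5e", FLAG).replace(b"\x7d\x5d", ESC)

    assert unframe(frame(b"\x00\x7e\x7d\xff")) == b"\x00\x7e\x7d\xff"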


We discussed length preambles vs. markers when we designed the FAST protocol 8 years ago. At that time we made length preambles optional (there are two "modes"). The nature of the FAST encoding made it impractical to use markers even though it would have been entirely possible. This time around we decided to use length preambles. Blink focuses on encoding short messages, so streaming works well with a preamble. We expect bit-level errors, disconnects etc. to be handled by a lower layer. Rolf


if ( msg.total_length > MAX_ALLOWED_SIZE ) { reset_parser(); }?

Should take care of any resource depletion issues, and do it way faster than eating data waiting for an end of field marker would.


But it produces a protocol which is incapable of handling messages greater than a certain length. The marketplace is littered with such protocols, but they're all useless since they can no longer even transmit a single word-processing document now that a page of text takes up megabytes of overhead.


Not every protocol must handle arbitrarily large messages, and that doesn't make them useless. A 9600 baud half-duplex connection carrying messaging that requires low latency is quite different from 1 Gbps Ethernet, and there every byte counts.


Also from the protocol buffer comparison:

"Finally, Blink defines a method of transferring a serialized schema that can be used to notify a receiver of a Blink message stream of the structure of the transferred messages. A similar method could be added on top of GPB, but would again be non-standard."

I think this is incorrect; the protobuf library includes exactly the same thing, unless I'm misunderstanding what the Blink authors are talking about. protoc can turn a textual schema into a protobuf-serialized message that describes it, which a receiver can use to decode messages of that type with no prior information:

https://code.google.com/p/protobuf/source/browse/trunk/src/g...

https://developers.google.com/protocol-buffers/docs/referenc...
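
For the curious, this is roughly what that looks like with the Python bindings (file and message names here are invented for illustration; the schema would first be serialized with something like protoc --descriptor_set_out=addressbook.pb addressbook.proto):

    from google.protobuf import descriptor_pb2, descriptor_pool, message_factory

    with open("addressbook.pb", "rb") as f:        # serialized FileDescriptorSet
        fds = descriptor_pb2.FileDescriptorSet()
        fds.ParseFromString(f.read())

    pool = descriptor_pool.DescriptorPool()
    for file_proto in fds.file:
        pool.Add(file_proto)

    # Build a message class at runtime, with no generated code on the receiver.
    factory = message_factory.MessageFactory(pool)
    Person = factory.GetPrototype(pool.FindMessageTypeByName("tutorial.Person"))

    wire_bytes = b""                               # placeholder for bytes received off the wire
    person = Person()
    person.ParseFromString(wire_bytes)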


I was maybe a little terse in my blog post. I agree that you can certainly do this, but it seems G decided that it was not worth adding as a core feature. At least that is my understanding from reading their docs. Blink on the other hand relies heavily on schema exchange.

Rolf


Avro as a data serialization format makes sense. As an RPC protocol, I think that dream died a long time ago.

If you don't transmit the schema of the message, you're in the same spot as Thrift or Protocol Buffers. Which, then, leaves you with that eternal question: why?


Re your first bullet: if such things are important to you, it looks like you can produce fixed-size packets through a variety of methods. In particular, they support overlength encodings.


True; but it's hard to support a fixed-length encoding when a message can contain nested collections of arbitrary length.


The just released Blink Native binary format might be an alternative if you want a fixed-length encoding: http://blog.blinkprotocol.org/2013/01/blink-native-format-re...


> Since the blink format contains no metadata, the recipient must have the schema with which the message was written. If schema migrations must be supported, this requires prefixing each batch of messages with the (encoded) schema. This may only be efficient when the batch size is large.

This is essentially a non-starter for me after a couple years of building distributed systems using protocol buffers, for which protocol evolution is the first feature.


They should just assign each schema a unique id, and have the server error out when receiving an ID it doesn't know. Then the client sends over the schema and the server caches it until death. This means the first request bears one additional round-trip, but the average request just carries an extra few bytes of identification.
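
A toy sketch of that idea (all names invented): only the first message carrying an unknown schema ID pays the extra round-trip; everything after that hits the cache.

    schema_cache = {}   # schema_id -> schema text, kept "until death" of the server

    def handle_message(schema_id, payload, fetch_schema, decode):
        # fetch_schema(id) stands in for the extra round-trip back to the client;
        # decode(schema, payload) stands in for the real wire-format decoder.
        if schema_id not in schema_cache:
            schema_cache[schema_id] = fetch_schema(schema_id)   # first request only
        return decode(schema_cache[schema_id], payload)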


Please have a look at the schema exchange specification. We welcome feedback. It is too costly to include metadata in every message. http://blinkprotocol.org/s/BlinkSchemaExchangeSpec-beta1.pdf

Anders


Another binary message format? Another language for describing the format instead of XML or JSON?

"Why?" is the big question which sits there unanswered.


I'm with you on this. When Thrift came out, it kind of made sense, because there wasn't really anything quite like it (closest I'd say was Hessian) in terms of the mix of simplicity, performance, etc. When Protocol Buffers came out, it kind of made sense, because Google had worked on it prior to Thrift even existing (indeed, it inspired Thrift). When Avro came out, I was like, WTF!? This is making that mistake yet again!!

EDIT: Damn, I forgot to mention MessagePack. ;-)


Avro is useful because it also defines a container format which is easily splittable by Hadoop.


Yeah, but one could just define a container format. It could have been done with protocol buffers in fact. I know, because I have implemented it.


I have no idea why they decided to, but my guess is that:

* XML and JSON can be slow.

* XML parsers are big

* XML file size is big

* JSON lacks proper binary support

* Thrift defines an entire transport stack, so if you already have a transport layer, you are adding more bloat

* Some people may find protocol buffers and thrift files too hard/tedious

* Why not? Sometimes it's easier/more fun to roll your own solution


XML and JSON violate the following key property (which can be interpreted as a requirement):

- Suitability for FPGA implementation


I meant the specification language, not the format itself.


ease of use. It is a much nicer syntax to use when authoring message specs. There are equivalent XML and JSON syntax variants for Blink schemas for those who'd prefer that.

Rolf


Wire format - performance. Language describing the format - author friendliness compared to more verbose formats. We are going to release specs for an XML schema format and translations to and from the Blink format. The XML format can be used for mappings to and from "external" schemas. Should we do a JSON schema format?


In games, for example, it's pretty common for bandwidth to be tight enough that a useful optimisation is quantizing floats down to shorts (which is good enough for e.g. quaternion representation of weapon direction). In that environment, the idea of using XML or JSON is ludicrous.

(i.e. the Web is not the Internet)
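
For example, a quaternion component in [-1, 1] can be sent as a 16-bit integer instead of a 32-bit float, at some loss of precision (a rough sketch, not any particular engine's code):

    def quantize(x):
        # map a float in [-1, 1] onto a signed 16-bit integer
        return max(-32767, min(32767, round(x * 32767)))

    def dequantize(q):
        return q / 32767.0

    # 4 components * 2 bytes = 8 bytes per quaternion instead of 16
    assert abs(dequantize(quantize(0.7071)) - 0.7071) < 1e-4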


This is by the guys who came up with FAST. That protocol is somewhat common in the dissemination of market data from exchanges. Pantor targets the financial sector and real-time trading systems in particular. It's a small company with smart people.

http://www.fixprotocol.org/fast


> Suitability for FPGA implementation

That's an interesting requirement you don't see very often.


High Frequency Trading applications, maybe?


Go straight to the tutorial (http://blinkprotocol.org/s/tutorial.html); it's easier to understand what it is all about.


I'm curious, why do you have a 3-clause BSD-style license on some PDF files that contain specifications? I guess they are a kind of binary built from the page layout source code, but I haven't really seen this before. I'm not sure if you can license the specification itself, i.e. put conditions on people that want to implement it.

http://blinkprotocol.org/s/specs.html


I'm not an expert, but you can definitely restrict the documents which describe the standard. Now, if someone sits there and reverse-engineers the spec without ever seeing that doc, that might be tough to prevent. But generally you can apply a license to any creative work you want; this includes technical documents. A good (or bad, depending on how you view it) example is service manuals for laptops.


the license is essentially a statement that the Blink specification is and will be entirely free to use. There has been a lot of discussion re lock-in and we wanted to be clear about our intentions. Implementors are obviously free to choose their own license terms and we may find that some will prefer to use GPL or LGPL, rather than BSD for contributed code.


There is a lawsuit related to the FAST protocol origination. They are certainly aware of this. Best to be explicit.


As someone who knows little about Protocol Buffers, my first question was how does this compare to Protocol Buffers...which they say blogging about is "on their todo list".

The protocol comes from Pantor Engineering, which apparently provides an "advanced trading system" out of Stockholm. Neat.


Hi, I'm going to update the FAQ. We have started comparing to GPB, please see here.

http://blog.blinkprotocol.org/2013/01/blink-compared-to-goog...

As you can see in a comment by Rolf, part of the perf limitation of GPB is the implementation and not the wire protocol.

Anders


Is there a reason why all the fields for protobuf are declared as "optional", when it looks like they are in fact not optional at all?


To support schema evolution. Of course it's up to you to write code that can actually handle missing fields, but once you have a required field in a message, you can never remove it.

From the docs:

"You should be very careful about marking fields as required. If at some point you wish to stop writing or sending a required field, it will be problematic to change the field to an optional field – old readers will consider messages without this field to be incomplete and may reject or drop them unintentionally. You should consider writing application-specific custom validation routines for your buffers instead. Some engineers at Google have come to the conclusion that using required does more harm than good; they prefer to use only optional and repeated. However, this view is not universal."

https://developers.google.com/protocol-buffers/docs/proto
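
In practice, "writing code that can actually handle missing fields" with optional fields looks something like this (Quote and price are invented names for a hypothetical generated proto2 class; HasField is the standard presence check in the Python bindings):

    def effective_price(quote):
        # quote is assumed to be an instance of a generated class with an
        # `optional double price` field (hypothetical schema).
        if quote.HasField("price"):
            return quote.price
        return 0.0   # sender's schema predates the field; pick a safe default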


good question; I usually declare fields optional for the reason described in the reply above. I just re-ran the test with all fields declared as "required". It ran ~2% faster.


Feels kinda like a cross between ProtoBuf and MessagePack. We already have so many serialization formats, though. What is the compelling use-case for this one?


Could be useful for some things. I feel like it's a missed opportunity to employ something like session types, given the existence of a quasi-typed human-readable "schema" definition. It seems like one could (and should!) go further and make it an unambiguous protocol specification against which implementing programs could eventually be checked.


Good point. Is it possible to create an unambiguous protocol specification that scales to more complex interactions? If so, and if you know how, I guess you would need a different specification language to get expressiveness? Blink has an extensible schema but that won't cut it. Anyhow, we have so far not even tried to do what you want but we hope that there is nothing in Blink that puts limitations on doing that separately. There is a lot more that we don't try to do in the protocol core.


Where does this fit on the network protocol stack? Does it replace TCP/UDP? Or is it on top of HTTP? Or is it not even a network protocol? I've skimmed the website but I'm surprised this isn't jumping right out at me tbh.


"Blink can be used as an efficient presentation layer on top of middleware solutions like message queues and messaging platforms. However, Blink is also well-suited to be layered on TCP and UDP without any need for encapsulating headers, thanks to its built in message type system."

How is it TCP without the headers? And how can you put a layer on top of TCP and call that saving space? Shouldn't you just be using TCP? It's late, I must be missing something ...


It's a standard for encoding data over streams like TCP. If you receive "00 4a 33 2b" over TCP, you have no idea what it means. But if you have a codebook or language defined to understand that data, you'll understand it means "Meet me on the first floor in half an hour". Blink is:

- A meta-language for defining such languages

- A specification of how messages in these languages are turned into binary data and back

When both client and server know the definitions (written in the meta-language), they can determine which kind of message the other side is sending from the metadata sent with the message (their "message type system"), without the need for any extra information like an HTTP content-type header.
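
In code, the "codebook" analogy is just a shared table keyed by the type tag that travels with each message (type IDs and layouts here are made up):

    CODEBOOK = {
        1: ("MeetingRequest", lambda body: {"floor": body[0], "minutes": body[1]}),
        2: ("Ack",            lambda body: {}),
    }

    def decode(frame):
        type_id, body = frame[0], frame[1:]
        name, decoder = CODEBOOK[type_id]   # both sides must share the same definitions
        return name, decoder(body)

    print(decode(bytes([1, 1, 30])))   # ('MeetingRequest', {'floor': 1, 'minutes': 30})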


This actually looks pretty cool. Disappointed to see that there's no reference implementation available as a library, however. Is there a plan to release one?


The (maybe only) good thing about there being no reference implementation is that we get feedback on the protocol itself and not the characteristics of a particular implementation. In this first phase, we are providing tools to ease independent implementation. We think it would be a sign of health that the specs are (become) good enough to create interoperable implementations from. We will make our plans known as they develop. We are doing more with the core protocol next.


One thing I sorely missed when working with FAST was an official and comprehensive test suite. A wide set of templates and encoded files to exercise all the nooks and crannies of the spec would have been very useful.

It looks like blink has far fewer special cases so the need for an official test suite might not be as great. Might be something worth considering anyway.


agreed, a test suite will be essential for implementors. I'd be very interested to hear your thoughts on this. Please mail me if you have interest. Rolf


Does any binary message format (Protocol Buffers, MessagePack, Blink, etc.) support data streaming? As far as I know, none do.


Why would I use this instead of ASN.1?


From the FAQ:

"We did not know how to make use of ASN.1 without making Blink a lot more complex."

This is the same reason Protocol Buffers exist: it was more fun to write something new than to understand how to use a better but more complex technology.


I think you failed to grasp the point of that decision.

It's not about understanding the more complex technology. At least in the case of protocol buffers, they understood ASN.1. Implementing all that complexity comes with a price. If nothing else, it takes time away from other priorities.

If you look at protocol buffers, there are still ways to make the code faster, cleaner, etc., even without adding new features.

Having worked with a variety of ASN.1 libraries in a variety of ways, I can tell you that in almost every case, I found either performance bugs, functional bugs, or just missing features that literally got in my way.


also, the ASN.1 spec isn't free, which reduces adoption.


Interesting.

And it applies to anything?


What can be said about this?



