Cap'n Proto, FlatBuffers, and SBE

ipsin · on June 18, 2014

I love Cap'n Proto, but I wish it came with more advice on how to use it as part of a full solution. For example, if you're looking for service registration and discovery, authentication or encryption, the answer seems to be "do something else, then add Cap'n Proto".

I realize it probably sounds like I'm asking for hand-holding, but I'm curious how people are using it as part of this nutritious breakfast... er, I mean, back-end infrastructure.

kentonv · on June 18, 2014

It's true, Cap'n Proto is not a complete solution yet, and in particular there's a lot of work to be done on RPC. Currently RPC can operate over any arbitrary two-way byte stream you give it (e.g. a TCP connection), but if you want encryption then it's up to you to provide a stream that does encryption.

I'd like to build more of this into the system. In fact, it's probably a requirement for sandstorm.io (my main project). It'll take time, though.

FWIW, Cap'n Proto RPC is working great in Sandstorm as a way for sandboxed processes to communicate with the supervisor. This is entirely IPC so far, so encryption is not needed.
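
To make the layering above concrete, here is a minimal Java sketch, not Cap'n Proto's actual API: a message layer that only knows how to write frames to whatever stream it is handed, with encryption supplied by wrapping that stream. The class name, the length-prefixed framing, and the fixed key/IV are all assumptions invented for illustration. Unauthenticated AES-CTR with a hard-coded key is not real transport security; it only stands in for "a stream that does encryption".

```java
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

// Hypothetical sketch: the message layer is unaware of encryption;
// the caller decides which stream to hand it.
final class StreamLayeringSketch {
    // Invented framing (4-byte length prefix), not Cap'n Proto's wire format.
    static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] payload = new byte[data.readInt()];
        data.readFully(payload);
        return payload;
    }

    // Demo-only cipher: AES-CTR has no integrity protection.
    static Cipher aesCtr(int mode, byte[] key, byte[] iv) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    // Same writeFrame call, plain stream underneath.
    static byte[] sendPlain(byte[] message) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            writeFrame(wire, message);
            return wire.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same writeFrame call, encrypting stream underneath.
    static byte[] sendEncrypted(byte[] message, byte[] key, byte[] iv) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try (OutputStream enc = new CipherOutputStream(wire, aesCtr(Cipher.ENCRYPT_MODE, key, iv))) {
                writeFrame(enc, message);
            }
            return wire.toByteArray();
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] receiveEncrypted(byte[] wire, byte[] key, byte[] iv) {
        try (InputStream dec = new CipherInputStream(
                new ByteArrayInputStream(wire), aesCtr(Cipher.DECRYPT_MODE, key, iv))) {
            return readFrame(dec);
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // demo key only
        byte[] iv = new byte[16];                                            // demo IV only
        byte[] wire = sendEncrypted("hello".getBytes(StandardCharsets.UTF_8), key, iv);
        System.out.println(new String(receiveEncrypted(wire, key, iv), StandardCharsets.UTF_8));
    }
}
```

The framing code is identical in both send paths; only the stream passed to it changes, which is the division of responsibility described above.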

stouset · on June 19, 2014

It seems to me like Cap'n Proto is better off without any inherent encryption layer. Let the user be responsible for setting up a two-way stream with their own security requirements. There's no reason Cap'n Proto should duplicate the efforts of TLS.

kentonv · on June 20, 2014

Maybe. But Cap'n Proto implements a capability-based model which is more web-of-trust than PKI, so arguably a lot of what's in TLS (and a lot of what makes TLS complicated) is irrelevant to it. I agree that people who actually want traditional PKI should use TLS; we're certainly not going to reinvent that. But there's an opportunity to do something much simpler.

There's also the issue that Cap'n Proto wants to implement a distributed capability fabric in which machine A may have a capability to an object on machine B and may send that capability to machine C, after which machine C should form a direct connection to machine B in order to make calls on that capability. How to accomplish this (without PKI) is well-known, but it may prove convoluted to build it with TLS as the basic building block. I have to do more research, though.

uuilly · on June 19, 2014

This sort of thing is great for messages in real-time systems or robotic systems where firing up a JSON parser / serializer just isn't an option. We use something similar in our system but I've had my eye on Cap'n Proto for a while...

kentonv · on June 18, 2014

I just added some corrections to this post (commit link at top of post). Sorry for the errors. Please let me know if you spot others.

tristanz · on June 18, 2014

It would be great if Cap'n Proto had better library implementations. This is one area where Protobuf, MsgPack, and, of course, JSON win.

dwrensha · on June 18, 2014

Cap'n Proto is relatively new, so yes, the tooling is less mature than for more established protocols. We've been making steady progress, though, and things are definitely usable right now. E.g. CloudFlare is reportedly using the Lua and Go implementations in production.

Perhaps you're interested in helping? We love new contributors, and I can personally attest that writing an implementation of Cap'n Proto is a great way to really get to know a language. I've been waiting for someone to jump on Swift...

_prometheus · on June 18, 2014

We should kickstart/bountysource more implementations. I'm sure plenty of hackers in between jobs would take a few weeks to build these. Everyone would benefit!

VeejayRampay · on June 19, 2014

I really like the bit about benchmarks. That's the mature approach to software engineering that we need to see more often: fewer comparisons and less arguing about the "absolute best", and more pointers about the trade-offs each one is making. That really helps users decide what fits their needs best.

jzwinck · on June 19, 2014

Nice comparison. I'm of a different opinion on the schema language point: XML is readily usable with standard libraries, whereas any "custom" language is unlikely to be. This matters when users want to implement their own tooling that understands the schema directly.

Three years ago I built a similar no-serialization message system, and while people initially objected to XML as the schema language, it was the right choice (even vs. JSON, which some wanted but would have been worse, e.g. no comments).

XML has been overused, but I think it's also highly appropriate for a schema language.

kentonv · on June 19, 2014

Note that all four systems compared have the ability to compile schemas into self-hosted structures. E.g. Cap'n Proto has schema.capnp which defines a Cap'n Proto structure representing a Cap'n Proto schema. Code generator plugins actually take this structure as input, and the parser is available as a library as well. So you can actually write tools that programmatically read and manipulate these schemas pretty easily.

It's certainly a matter of opinion, but I find XML- and JSON-based schema languages extremely difficult to read with all the boilerplate they end up having.

ahupp · on June 19, 2014

The issue with XML-based schema languages is that they optimize for making the language designer's life easier at the expense of its users. For anything with significant usage this is not a great tradeoff.

scriptproof · on June 19, 2014

For the Windows question, I should point out that it is not limited to Visual Studio or Cygwin. I use MinGW daily and have an experience similar to that of GCC or Clang on Linux. Clang also runs on Windows, but it lacks its own linker for now and uses the one from MinGW or VS.

jetp · on June 19, 2014

I'm doing a lot of JSON encoding/decoding in a web app. Would Cap'n Proto or one of the others help speed that up? It seems like using these takes more than one function call.

kentonv · on June 19, 2014

For Cap'n Proto to help, you have to switch from using JSON format to using Cap'n Proto format, which is indeed a lot more than just one function call. But it should be orders of magnitude faster than JSON. So if profiles show you're spending a lot of time in JSON handling, then, yes, it could help.

Right now, Cap'n Proto makes the most sense for server <-> server communications, e.g. between a front-end and a back-end. In the future I hope to have a production-ready implementation in JavaScript, which would allow you to use Cap'n Proto all the way up the stack.

_prometheus · on June 18, 2014

Thanks kentonv, super useful comparison :)

alexnewman · on June 18, 2014

I don't see how you'd ever get Java working without JNI, which will be remarkably slow.

kentonv · on June 18, 2014

You use ByteBuffer, which has methods for reading numeric primitives of various sizes (int, long, float, double, ...) from specified offsets.
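
A minimal sketch of the offset-based reads described above, using only `java.nio.ByteBuffer`. The record layout, field names, and offsets are hypothetical and chosen for illustration; this is not Cap'n Proto's actual encoding, only the access pattern that avoids JNI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical fixed layout, invented for this example:
//   offset 0:  int32   id
//   offset 4:  (padding)
//   offset 8:  int64   timestamp
//   offset 16: float64 price
final class OffsetReadSketch {
    static final int ID_OFFSET = 0;
    static final int TIMESTAMP_OFFSET = 8;
    static final int PRICE_OFFSET = 16;
    static final int SIZE = 24;

    // Write each field at its fixed offset using the absolute put methods.
    static ByteBuffer encode(int id, long timestamp, double price) {
        ByteBuffer buf = ByteBuffer.allocate(SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(ID_OFFSET, id);
        buf.putLong(TIMESTAMP_OFFSET, timestamp);
        buf.putDouble(PRICE_OFFSET, price);
        return buf;
    }

    // Accessors read straight from the buffer; nothing is parsed or copied up front.
    static int id(ByteBuffer buf) { return buf.getInt(ID_OFFSET); }
    static long timestamp(ByteBuffer buf) { return buf.getLong(TIMESTAMP_OFFSET); }
    static double price(ByteBuffer buf) { return buf.getDouble(PRICE_OFFSET); }

    public static void main(String[] args) {
        ByteBuffer buf = encode(7, 1403049600000L, 12.5);
        System.out.println(id(buf) + " " + timestamp(buf) + " " + price(buf));
        // prints: 7 1403049600000 12.5
    }
}
```

The absolute `getInt(int)`, `getLong(int)`, and `getDouble(int)` calls do not move the buffer's position, so generated accessors in this style can be called in any order.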

kasey_junk · on June 18, 2014

Or sun.misc.Unsafe for a non-portable solution.

theycallmemorty · on June 20, 2014

I would love to see Apache Thrift added to this comparison.

brunoqc · on June 19, 2014

Would it make sense to use Cap'n Proto with JavaScript instead of JSON?

kentonv · on June 19, 2014

Possibly. There's an implementation based on TypedArrays. I haven't benchmarked it against JSON. It's very possible that JSON would be faster in a browser simply by virtue of it being built-in. But you might choose to use Cap'n Proto in the browser anyway if your server infrastructure were based on it. It's nice to have the same format all the way down the stack. And there's generally plenty of CPU to spare on the client side.

All that said, I think Cap'n Proto in the browser will only really start making sense once there is a pure-JavaScript RPC implementation over WebSocket.