Libchan: Like Go channels over the network (github.com/docker)
171 points by akerl_ on June 10, 2014 | 60 comments



I'd like to know why to use this over ZeroMQ. Yes, ZeroMQ has its own protocol (which libchan's README points out), but I haven't found that protocol to get in the way, because in the end you're just sending text down the ZeroMQ socket and ZeroMQ's protocol is almost entirely "behind the scenes" so to speak.

I do understand why Docker would create this for their own use, because having a dependency on messaging libraries (where people tend to have varying opinions) wouldn't be the greatest for a wildly popular open source project. I'm just trying to figure out why others would want to use this.

I'm all for more tools in this space, though I like to make sure a tool exists for reasons other than "not invented here" before I learn yet another messaging library.


Hey, we (dotCloud, now Docker) have used ZeroMQ a LOT in the past. The entire dotCloud platform is based on zerorpc (http://github.com/dotcloud/zerorpc), which is in turn based on ZeroMQ.

Libchan is the spiritual child of zerorpc, so it is definitely informed by our experience with ZeroMQ. The main lesson was: it's awesome, but it does too much, too low in the stack, and it's too much of a black box to strip out the parts we don't need.

Also, the inability to safely expose it on the public internet was a big problem. I know this has been addressed since then, but we don't want to use a separate protocol stack for that: we want to use the existing HTTP (soon HTTP2) and TLS middleware infrastructure. That's why libchan uses spdy/tls as its main network transport.


I think a more detailed writeup on the issues you had, and the rationale behind the choices for libchan, would be fascinating, if you ever have the chance. :-)


I agree. The summary still sounds like NIH. Details would be awesome. Blog post it, HN the blog post; you guys know how to do that, I heard ;)

Thanks!


> That's why libchan uses spdy/tls as its main network transport.

Just as forecast by many on HN, HTTP is becoming the network layer.


spdy+tls is not HTTP! They're building blocks on which you can write protocols that expect some interesting properties (auth, session, multiplexing, bidirectionality,...).

You must be referring to HTTP2, which is indeed HTTP1 + SPDY + TLS in one single protocol. Actually, if SPDY catches on for other uses (I already know of muxado [0], which uses a SPDY-based protocol to multiplex requests/responses between clients and servers) and starts living on its own, it will be funny to see whether HTTP2 really embeds it (thereby duplicating the work).

[0] https://github.com/inconshreveable/muxado/


Apparently Armon built one too a few weeks ago [1]

I think I'd like to move muxado in a direction where it provides the common interface to the various stream multiplexers. That way you could code against a common interface and swap out http2/spdy/muxado/yamux as the implementation to suit your performance needs.

I'd also like to see libchan built as a layer on top of this common interface, using it as its underlying primitive (a rough sketch of what such an interface might look like is below).

[1] https://github.com/hashicorp/yamux
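
To make that concrete, here's a hypothetical sketch in Go of what such a common interface might look like (the names are mine, not muxado's or yamux's actual API):

    // Hypothetical common interface over stream multiplexers
    // (http2/spdy/muxado/yamux); all names here are illustrative.
    package mux

    import "net"

    // Session multiplexes many logical streams over one net.Conn.
    type Session interface {
        Open() (net.Conn, error)   // initiate a stream to the peer
        Accept() (net.Conn, error) // wait for a peer-initiated stream
        Close() error              // tear down the session and its streams
    }

    // Transport wraps an existing connection with a particular
    // multiplexer implementation, so implementations can be swapped.
    type Transport interface {
        Client(conn net.Conn) (Session, error)
        Server(conn net.Conn) (Session, error)
    }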


> Also, the inability to safely expose it on the public internet was a big problem.

Deus, why? Why on Earth would parts of your critical internal infrastructure be exposed to the public Internet?!


Who said anything about "internal"? We want every piece of our infrastructure to use a common, unified communication layer. If parts of it should be authenticated, tunneled, throttled or otherwise protected from nefarious outsiders, then we want to add the appropriate off-the-shelf middleware, or possibly write our own middleware. It makes no sense, in my mind, to artificially separate "internal" and "external" by implementing completely different protocol stacks and tools. In the end this hurts security because everything gets more complicated and you lose visibility.

Just my 2c based on our own production experience.


The README looks interesting, but with zero API documentation, it is just a curiosity; see https://godoc.org/github.com/docker/libchan for how this package looks to GoDoc. Usage information and context is essential for encouraging other developers to adopt a dependency and start returning contributions.


We introduced this at Dockercon today, along with a few other projects.

Here are the slides for the presentation: http://www.slideshare.net/shykes/docker-the-road-ahead

Note: the illustrator who helped me with this is @laurelcomics on Twitter, she is awesome and available for contracting :)


OT: Is there a PR campaign being run by the Docker company right now? There's an awful lot of Docker-related news lately, not just the product release but all these side projects. It seems the side projects were announced to coincide with the product release, to prolong the buzz.

Just curious.


They seem to be engaged in the best kind of PR: shipping lots of cool code :)

As the other responder pointed out, DockerCon is currently in progress, so most of the projects that have hit HN were just announced publicly as part of that. On a related note, I recommend folks keep an eye out for the videos of the talks. The Red Hat talk on geard/atomic and shykes's talk on the future of Docker (and these other projects) were especially great.


DockerCon has been going on: http://www.dockercon.com/


HN is a pretty good PR channel for such news. It's easy to get in front, too: get 10 people to upvote you and you're front-page material.

It's a little annoying when they spam it, but meh, they work on stuff we need and it's open source - could be worse.


Nice, I've read a bit about the netchan lib that existed in previous Go versions. How is the problem with netchan described here https://groups.google.com/d/msg/Golang-Nuts/kK8XqkaaVuU/ZQdK... handled?


Based on the code snippet above: by not trying to truly replicate the channel API. Which is a good thing, because the channel API is simply too simple for something that may involve network communication. It's only trying to truly replicate the channel API over the network that causes problems; "channel-like" abstractions themselves have no particular complication.


Back in the days of DCE, DCOM, RMI, etc, everyone used to say that "location transparency" was a myth.

Has that changed?


This is actually the opposite pattern: instead of pretending everything is a local function even over the network (which turned out to be a bad idea), what if we did it the other way around? Pretend your components are communicating over a network even when they aren't. This is made possible by very efficient lightweight threads (goroutines in go, gevent in Python etc.) and message-oriented patterns popularized by Go channels (and Erlang/OTP before it).

Now your application always checks for IO errors, and the underlying "plumbing" is exposed and available for the developer to tweak at will: timeouts, caching, failover, fan-in, fan-out, etc. become programmable components just like the rest of your app.

TLDR: this is not born-again RPC. It's the anti-RPC.
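
A rough sketch of what that looks like in practice (a hypothetical channel-like API, not libchan's actual signatures): because every receive is an explicit operation that returns an error, plumbing like timeouts is just code you write around it.

    // Hypothetical channel-like API; not libchan's actual signatures.
    // (imports: "errors", "time")
    type Receiver interface {
        Receive() ([]byte, error) // may cross a network, may not
    }

    // A timeout policy is ordinary application code, not something
    // buried inside an RPC runtime.
    func receiveTimeout(ch Receiver, d time.Duration) ([]byte, error) {
        type result struct {
            msg []byte
            err error
        }
        done := make(chan result, 1)
        go func() {
            msg, err := ch.Receive()
            done <- result{msg, err}
        }()
        select {
        case r := <-done:
            return r.msg, r.err // IO errors are ordinary error values
        case <-time.After(d):
            return nil, errors.New("receive timed out")
        }
    }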


> Pretend your components are communicating over a network even when they aren't. This is made possible by very efficient lightweight threads (goroutines in go, gevent in Python etc.) and message-oriented patterns popularized by Go channels (and Erlang/OTP before it).

It's good you mentioned Erlang. What you're describing >is< Erlang/OTP. The language is structured such that you're always somewhat paying a price for fault-tolerant, concurrency-safe, and parallelizable distribution. It turns out that this makes for a nifty functional programming language and, surprise surprise, the result is great for fault-tolerant, concurrency-safe, and parallelizable distribution.


> Pretend your components are communicating over a network even when they aren't

> Now your application always checks for IO errors, and the underlying "plumbing" is exposed and available for the developer to tweak at will: timeouts, caching, failover, fan-in, fan-out, etc

The big problem is that the bandwidth and latency numbers are vastly different over an actual network vs between processes or OS threads on a single machine vs between green threads within a process.

The problem is that sometimes you require high performance, to a degree that is not possible across an actual network link. And if your coding style doesn't distinguish between an actual network link and an imaginary in-process link (or worse, deliberately makes them indistinguishable and silently interchangeable), sooner or later someone will refactor it or change the config file or something and your microsecond-scale latency that you were assuming and relying on has suddenly become multiple-millisecond latency and everything grinds to a halt.
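
For a sense of scale, some ballpark order-of-magnitude figures (from the oft-cited "latency numbers every programmer should know"):

    send on an in-process channel        ~ 0.1 us
    IPC between processes on one box     ~ 1-10 us
    round trip within a datacenter       ~ 500 us
    round trip across an ocean           ~ 150 ms

That's roughly six orders of magnitude between the first and last, which is exactly the kind of gap a refactor can silently introduce.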


> And if your coding style doesn't distinguish between an actual network link and an imaginary in-process link...

If you start looking into the ways that parallelism can become highly inefficient on current multicore architectures, you'll find that there is an inter-core/socket memory hierarchy and communications barrier which isn't well established in the mainstream consciousness of programmers. It turns out that there is often far less of a difference between an actual network link and an imaginary in-process link than a naive programmer might believe, and the conditions which cause this can result from a subtle interplay between multiple hardware and software mechanisms.

Here's one - http://en.wikipedia.org/wiki/False_sharing
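
It's easy to demonstrate, too; here's a quick, unscientific Go sketch: two goroutines each hammer their own counter, first with the counters (typically) sharing a 64-byte cache line, then with padding between them.

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    type shared struct {
        a, b uint64 // a and b typically share a 64-byte cache line
    }

    type padded struct {
        a uint64
        _ [56]byte // push b onto its own cache line
        b uint64
    }

    // bench runs two goroutines, each incrementing its own counter.
    func bench(inc1, inc2 func()) time.Duration {
        start := time.Now()
        var wg sync.WaitGroup
        wg.Add(2)
        go func() {
            defer wg.Done()
            for i := 0; i < 1e8; i++ {
                inc1()
            }
        }()
        go func() {
            defer wg.Done()
            for i := 0; i < 1e8; i++ {
                inc2()
            }
        }()
        wg.Wait()
        return time.Since(start)
    }

    func main() {
        var s shared
        var p padded
        // Each goroutine touches a *different* word, yet the shared
        // version is typically several times slower: the cores spend
        // their time bouncing the cache line back and forth.
        fmt.Println("shared:", bench(func() { s.a++ }, func() { s.b++ }))
        fmt.Println("padded:", bench(func() { p.a++ }, func() { p.b++ }))
    }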

> (or worse, deliberately makes them indistinguishable and silently interchangeable)

Erlang/OTP actually makes this tradeoff, much to its advantage. In fact, the multicore pathologies I refer to above make the tradeoff more attractive.

> sooner or later someone will refactor it or change the config file or something and your microsecond-scale latency that you were assuming and relying on has suddenly become multiple-millisecond latency and everything grinds to a halt.

What you describe is either a poorly managed shop or a poorly conceived programming environment. Either the behavior of such a system should be one of the 7 or so things you must know to program in it, or the environment should make it glaringly awkward to rely on something as a synchronous call.


You're reciting an obfuscated tautology -- hard things are hard.

But most things are not hard. Hard things can be dealt with if and when they come up through documentation and training. Dealing with them by making easy things equally hard is silly.


> You're reciting an obfuscated tautology -- hard things are hard.

Try this analogy on: moving in deep blizzard conditions is hard. Using a vehicle with tracks and skids would make that easier, but that's obviously not a sensible general-purpose vehicle, because it would suck for driving on dry highways.

> But most things are not hard. Hard things can be dealt with if and when they come up through documentation and training. Dealing with them by making easy things equally hard is silly.

You could just as easily apply this reasoning to goto statements. Also to memory management. I'm not saying you don't have a point here -- I'm on board with "the right tool for the job" -- but your analysis could be a bit more nuanced.


I don't think you understand what my reasoning is, because it would generally counsel against goto and manual memory management, which are harder tools for solving harder problems, and are usually unnecessary.


> I don't think you understand what my reasoning is

That you shouldn't restrict the use of certain tools/constructs/features in order to make areas like concurrency easier. Apparently you misunderstood my analogy. For one thing, it is an analogy, not a statement that one would actually use goto or manual memory management.

> would generally counsel against goto and manual memory management, which are harder tools for solving harder problems,

That is a matter of scale. At small scales, "just using a goto" seems easier. It's only at larger scales that it becomes untenable spaghetti, and therefore harder. Herein lies another analogy that can be related to concurrency and parallelism.


How am I saying that?

In-thread control transfers vs in-system IPC vs network links all have very different performance (bandwidth/latency) profiles.

The language should not encourage people to conflate them. They behave differently enough that making them indistinguishable is not a sane abstraction.

This does not mean that they need to have wildly different interfaces. Just like 'int' and 'double' don't have wildly different interfaces, but your code still needs to specify which it's using.


If you're treating abstractions as black boxes, I can see how you would have a problem. Otherwise, I don't see one.

When performance matters, specify/document the performance characteristics of your modules and their deployment proximity requirements. This is hardly unusual, I do it all the time.


Essentially, instead of fighting against concurrency, by building big fancy black boxes of state, we now embrace concurrency, and let it permeate through the (green) fabric of the program.


This isn't actually true of Go channels, though. They don't have a way of reporting I/O errors (because no I/O), and they're often synchronous.


You're totally right. That's why we call it "like" Go channels; this is one of the differences. (Another is that you can't map arbitrary Go types: there is only one return channel and one fd allowed per message, at least for now.)


I would love to see a small example/snippet illustrating how I'd use this.


Here's how you would implement basic RPC-style request/response.

On the client:

    var ch libchan.Sender

    // Send a message, indicate that we want a return channel to be automatically created
    ret1, err := ch.Send(&libchan.Message{Data: []byte("request 1!"), Ret: libchan.RetPipe})

    // Send another message on the same channel
    ret2, err := ch.Send(&libchan.Message{Data: []byte("request 2!"), Ret: libchan.RetPipe})

    // Wait for an answer from the first request. Set flags to zero
    // to indicate we don't want a nested return channel
    msg, err := ret1.Receive(0)
    fmt.Printf("Received answer: %s\n", msg.Data)
On the server:

    var ch libchan.Receiver

    // Wait for messages in a loop
    // Set the return channel flag,
    // to indicate we want to receive nested channels (if any).
    // Note: we don't send a nested return channel, but we could.
    for {
        msg, err := ch.Receive(libchan.Ret)
        msg.Ret.Send(&libchan.Message{Data: []byte("this is an utterly useless response")})
    }


Here are Solomon's slides on the topic.

http://www.slideshare.net/shykes/docker-the-road-ahead


I came here to say the same thing. Neither the README nor the examples folder had a very short, basic example. Thanks for the answer (below/above).


I wonder if the authors are aware of Go'circuit:

https://github.com/gocircuit/circuit


"like Go channels over the network"

um, sockets anyone?


It's possible to send a channel over a channel in Go. This allows you, for example, to pass in the response channel for a message.
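
In plain Go that's the classic reply-channel pattern, e.g.:

    package main

    import "fmt"

    // Each request carries its own reply channel -- a channel
    // sent over a channel.
    type request struct {
        data  string
        reply chan string
    }

    func main() {
        reqs := make(chan request)

        // server: answers on whatever channel the request carries
        go func() {
            for r := range reqs {
                r.reply <- "echo: " + r.data
            }
        }()

        r := request{data: "hello", reply: make(chan string, 1)}
        reqs <- r
        fmt.Println(<-r.reply) // "echo: hello"
    }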

It's possible to pass file descriptors over Unix sockets, but that only works locally.


Note that, when using the Unix socket transport, libchan actually utilizes the ability to pass file descriptors.
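
For reference, the underlying mechanism is SCM_RIGHTS ancillary data on the Unix socket; a minimal sketch using the standard library (not libchan's actual code, with buffer sizes and error handling simplified):

    // Pass an open file descriptor across a Unix socket using
    // SCM_RIGHTS (imports: "net", "syscall"). Sketch only.
    func sendFd(conn *net.UnixConn, fd int) error {
        rights := syscall.UnixRights(fd)
        _, _, err := conn.WriteMsgUnix([]byte{0}, rights, nil)
        return err
    }

    func recvFd(conn *net.UnixConn) (int, error) {
        buf := make([]byte, 1)
        oob := make([]byte, 128)
        _, oobn, _, _, err := conn.ReadMsgUnix(buf, oob)
        if err != nil {
            return -1, err
        }
        msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
        if err != nil {
            return -1, err
        }
        fds, err := syscall.ParseUnixRights(&msgs[0])
        if err != nil {
            return -1, err
        }
        return fds[0], nil
    }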


You can't send a tcp socket over a tcp socket :) (just one among many reasons why raw sockets are necessary but not sufficient).


How does this work? If A has a channel to B, and sends that channel to C, do B and C now communicate directly, or is it proxied through A?

EDIT: I see you mentioned that the Unix socket transport passes around file descriptors. How about the inter-host transports? Obviously two browsers can't directly connect to each other (or could they, with WebRTC?). What about, say, TCP between hosts on the same local network?


And I suppose you can with a channel? Or are you just sending a proxy to the socket (which you could also do with a traditional socket and an appropriate protocol).

I'm not trying to diminish the importance of an easy to use library that abstracts away the socket-level code with a good protocol. I'm just saying that sending a socket over a socket isn't a good way to pitch it.


Each time you say "raw socket" I think of http://man7.org/linux/man-pages/man7/raw.7.html and http://en.wikipedia.org/wiki/Raw_socket - so you mean IP sockets without UDP or TCP?


No, sorry, I mean regular TCP sockets without extra framing, as opposed to the spdy or websocket transport.


You can on Plan 9; you can even send it over serial if you like.


That alone doesn't ensure control of a channel, though. Passing data through sockets is easy; controlling that transmission in terms of state and potential race conditions is much harder.


While I can understand that the Docker team operates in a very Go-centric environment, it’s a shame that some of these libraries don’t export a C interface to enable other languages to use them.

That being said, it’s a neat project that I’m tempted to port the ideas from.


We explicitly optimized for ease of portability, because Docker needs a solid implementation in every major language (so that we can use libchan as a transport for remote access to Docker, and for introspection by containers). Would you like to work with us on a C implementation? :)


I'm not stellar with C, but I'd be happy to contribute!


No need to be stellar :) Feel free to join #libswarm on Freenode, that's where the devs hang out!


Appears to be heavily inspired by Node.js streams [1]. Would love this pattern to be copied to other languages as well.

[1] https://github.com/substack/stream-handbook


How about this? Make a new language that compiles to Javascript with Go like functionality, but instead of the "go" keyword to fire a coroutine, use something like "fore." Then enable single argument code blocks (lambdas) to be attached to a channel, also making it a function. Channels would have the same abbreviation as in Go. I think there is a lot one could do with these two features. Of course, one would have to name such a language after its two major features.


clojurescript + core.async?


Yes, but then you'd lose the pun.


Looks very interesting. I've been testing NSQ vs 0MQ; I like NSQ's durability, but it's missing basic stuff such as a reply queue (you have to manage that yourself). 0MQ seems OK, but encryption doesn't appear to be supported in the Go client. Will take a good look at this. Thanks!


Here's an NSQ RPC server written in Erlang: https://github.com/project-fifo/ensq_rpc

I've been meaning to spend an afternoon porting it to Go...


I wrote something similar to this a while back for Plan 9. Having channels over the network is really awesome, and it's the one thing I think Go is badly missing. It would be cool if the author could have this make use of the normal Go syntax for message passing.


Is this designed for more higher level protocols to be built on top of? For example, could ZeroMQ, NanoMSG, WebSockets... rely on this?

Or is this meant to supplant all the existing protocols that we already have?

Would erlang have a use for this?


From the readme:

> An explicit goal of libchan is simplicity of implementation and clarity of spec. Porting it to any language should be as effortless as humanly possible.

Where is the spec?


And down goes erlang!



