
Our main hindrance with gRPC was that several disparate teams ran into strange issues with the fairly opaque runtime. The “batteries included” approach made attempts to debug the root causes quite difficult.

As a result of the above, we have been exploring twirp. You get the benefits of using protobufs to define the RPC interface, but without as much of the runtime baggage that complicates debugging when issues arise.
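For anyone who hasn't used twirp, a rough sketch of the shape of it in Go (a hypothetical Echo service; the generated names follow twirp's usual New<Service>Server convention). Everything is served as a plain http.Handler over ordinary HTTP POSTs, so requests are easy to inspect with standard tooling:

    package main

    import (
        "context"
        "log"
        "net/http"

        pb "example.com/rpc/echo" // hypothetical package generated by protoc-gen-twirp
    )

    // echoServer implements the interface twirp generates from the proto
    // service definition (rpc Echo(EchoRequest) returns (EchoResponse)).
    type echoServer struct{}

    func (s *echoServer) Echo(ctx context.Context, req *pb.EchoRequest) (*pb.EchoResponse, error) {
        return &pb.EchoResponse{Message: req.Message}, nil
    }

    func main() {
        // The generated constructor returns a plain http.Handler; there is no
        // long-lived runtime underneath, just net/http.
        handler := pb.NewEchoServer(&echoServer{})
        log.Fatal(http.ListenAndServe(":8080", handler))
    }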




That's always the problem with "batteries included". If they don't work it's often not worth the effort to fix them; you gotta toss em.

I'm curious what languages you were using gRPC with. The batteries includedness across tons of languages is a big part of gRPC's appeal. I'd assume Java and C++ get enough use to be solid but maybe that's wishful thinking?


We were mostly using ruby (whose gRPC gem wraps the C bindings) and golang (which has a native Go implementation).


What kind of problems did you run into, if you don't mind sharing?


One that we encountered in several services was gRPC ruby clients that semi-regularly blocked on responses for an indeterminate amount of time. We added lots of tracing data on the client and server to instrument where the “slowness” was occurring. We would see every trace span look just like you would hope until the message went into the runtime and failed to get passed up to the caller for some random, long period of time. Debugging what was happening between the network response (fast) and the actual parsed response being handed to the caller (slow) was quite frustrating, since it required digging into the C bindings/runtime from the ruby client.
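For illustration (in Go rather than the ruby client in question, and not the actual instrumentation), the rough shape of that client-side timing is a unary interceptor that measures the whole call as the caller sees it, which is where the gap between the fast network response and the slow hand-off to the caller shows up:

    package client

    import (
        "context"
        "log"
        "time"

        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
    )

    // timingInterceptor records wall-clock time for every unary call as seen by
    // the caller, i.e. including whatever the gRPC runtime does between
    // receiving bytes off the wire and handing back a parsed response.
    func timingInterceptor(ctx context.Context, method string, req, reply interface{},
        cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
        start := time.Now()
        err := invoker(ctx, method, req, reply, cc, opts...)
        log.Printf("%s took %s (err=%v)", method, time.Since(start), err)
        return err
    }

    func dial(target string) (*grpc.ClientConn, error) {
        return grpc.Dial(target,
            grpc.WithTransportCredentials(insecure.NewCredentials()),
            grpc.WithUnaryInterceptor(timingInterceptor),
        )
    }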


It was a couple of years ago, but the Go gRPC library had pretty broken flow control. gRPC depends on both ends having an accurate picture of in-flight data volumes, both per-stream and per-transport (muxed connection). It's a rather complex protocol, and it isn't rigorously specified for error cases. The main problem we encountered was that errors, especially timed-out transactions, would cause the gRPC library to lose track of buffer ownership (in the host-to-host sense) and result in a permanent decrease in a transport's available in-flight capacity. Eventually it would hit zero and the two hosts would stop talking. Our solution was to patch out the flow control (we already had app-level mechanisms).

[edit: The flow control is actually done at the HTTP/2 level. However, the Go gRPC library has its own implementation of HTTP/2.]
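To make the failure mode concrete, a toy model of the sender-side bookkeeping (this is not grpc-go's code, just the accounting in miniature): quota is reserved on send and only returned when the receiver consumes the data, so an error path that discards a buffer without returning its quota leaks capacity permanently.

    package main

    import "fmt"

    type transportWindow struct {
        limit    int // advertised window, e.g. 64KB
        inFlight int // bytes sent but not yet credited back via WINDOW_UPDATE
    }

    func (w *transportWindow) available() int { return w.limit - w.inFlight }

    // send reserves quota before putting bytes on the wire; if no quota is
    // available, the sender has to block until the peer returns some.
    func (w *transportWindow) send(n int) bool {
        if n > w.available() {
            return false
        }
        w.inFlight += n
        return true
    }

    // consumed is called when the receiver hands data to the application and
    // sends WINDOW_UPDATE, returning quota to the sender.
    func (w *transportWindow) consumed(n int) { w.inFlight -= n }

    func main() {
        w := transportWindow{limit: 65535}
        for i := 0; i < 100; i++ {
            w.send(1024)
            // Buggy path (every 10th call here, standing in for a timed-out
            // RPC): the buffer is discarded on error and consumed() is never
            // called, so 1KB of transport capacity leaks each time.
            if i%10 != 0 {
                w.consumed(1024)
            }
        }
        fmt.Println("available:", w.available()) // 55295, not 65535
    }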


Yet another batteries-included downside... a black-box HTTP implementation that is hard to debug.


This isn’t to say it happened on every response; it was a relatively small fraction. But it was enough to tell that _something_ was going on. Who knows, it could be something quirky on the network and not even a gRPC issue. But because the runtime is so opaque, it made debugging quite difficult.


Even Java codegen has issues - like a single class file so big it crashes any IDE not explicitly set up for it, or a whole bunch of useless methods that make autocomplete terrible.



