gRPC: The Bad Parts (kmcd.dev)
228 points by temp_praneshp 3 months ago | 223 comments



All good points. But I'd argue that the single worst part of gRPC is the impenetrability of its ecosystem. And I think that, in turn, is born of complexity. The thing is so packed with features and behaviors and temporal coupling and whatnot that it's difficult to produce a compatible third-party implementation. That means that, in effect, the only true gRPC implementation is the one that Google maintains. Which, in turn, means that the only languages with good enough gRPC support to allow betting your business on it are the ones that Google supports.

And a lot of these features arguably have a poor cost/benefit tradeoff for anyone who isn't trying to solve Google problems. Or they introduce painful constraints such as not being consumable from client-side code in the browser.

I keep wishing for an alternative project that only specifies a simpler, more compatible, easier-to-grok subset of gRPC's feature set. There's almost zero overlap between the features that I love about gRPC, and the features that make it difficult to advocate for adopting it at work.


As someone at a very small company, with no affiliation with any Google employees, I've found gRPC and protobuf to be a godsend in many, many ways. My only complaint is that protoc is cumbersome af to use, and that is almost solved by buf.build. Except for our most used language, Java.

Protobuf has allowed us to version, build and deliver native language bindings for a multitude of languages and platforms with a tiny team for years now, without a single fault. We have the power to refactor and design our platform and APIs in a way that we never had before. We love it.


You can get this without gRPC, though. Any IDL with codegen will do. My journey went from gRPC fanatic to hardened skeptic. After one too many “we’re doing it this way because that’s what Google wants (and thus my promotion needs)” decisions from gRPC “maintainers”, I can’t stomach yielding so much control to a network request framework anymore. There’s something to be said for just making HTTP requests using your platform/language native tooling. Is writing request boilerplate really your dev bottleneck?


Avoiding request boilerplate, in and of itself, is a benefit I could take or leave. Boilerplate request/response code is boring, but not actually time-consuming to write.

What I really like about the IDL and codegen is that it makes inter-team coordination easier. Comments in the .proto file are much nicer for documentation than, e.g., Swagger docs. And it gets even better when you're negotiating protocol changes among teams. You can just start a pull request with the proposed changes to the message format and documentation, which makes the whole process so much easier (and more accurate!) than what I've experienced when negotiating protocol and API changes for REST-style services.

And then, after it gets merged, the rollout is more likely to be successful. With REST, the risk of regression is greater. You've got to contend with ill- and implicitly-defined backward compatibility semantics that are inconsistent among different JSON and request library implementations, and with a greater risk of each team implementing the message serde code in subtly incompatible ways. Protobuf and gRPC don't eliminate that risk, but they greatly reduce it.

That said, yes, you're right, it does often feel like gRPC steps over the line from being sensibly opinionated to being a way for Google engineers I've never even met to micro-manage me. I wouldn't say I'm a hardened skeptic yet, but I'm definitely no longer a fanatic.


Maybe hardened skeptic is HN hyperbole. I tend to avoid adding it to every project I touch these days, though.


> Except for our most used language, Java.

The official Java implementation of grpc looks like abandonware. Out of the box the builder includes an annotation (javax.annotation.Generated) that was deprecated in 2019:

https://github.com/grpc/grpc-java/issues/9179

This gives me serious pause.


I don't think it's abandonware, per se, so much as that gRPC deliberately stays way behind the times with respect to Java language versions, so that it can support enterprise users who are conservative about Java upgrades. I don't know where the Java implementation is now, but when I was using it ca. 2019, it was still officially targeting Java 7. This was after the end of public updates, but still well within Oracle's paid support period for that version.

Java 7 support is now completely over, so I'm guessing they're targeting Java 8 now. This does create an annoyance for Java 9 and later users and requires workarounds for them. I don't see this as a maintainability problem so much as a culture clash: gRPC is doing things the Google way; other people don't want to do it the Google way.

All that said, I don't think other people are wrong. This is exactly the kind of thing I was complaining about - a lot of gRPC's potential (at least for my purposes) is undermined by its status as a big complicated enterprisey Google monoculture project.


> But I'd argue that the single worst part of gRPC is the impenetrability of its ecosystem

I have had the opposite experience. I visit exactly two repositories on GitHub, which seem to have the vast majority of the functionality I need.

> The thing is so packed with features and behaviors and temporal coupling and whatnot that it's difficult to produce a compatible third-party implementation.

Improbable did. But who cares? Why do we care about compatible third party implementations? The gRPC maintainers merge third party contributions. They care. Everyone should be working on one implementation.

> features arguably have a poor cost/benefit tradeoff for anyone who isn't trying to solve Google problems.

Maybe.

We need less toil, less energy spent reinventing half of Kubernetes and half of gRPC.


> Improbable did. But who cares? Why do we care about compatible third party implementations? The gRPC maintainers merge third party contributions. They care. Everyone should be working on one implementation.

Until they get fired by Google.


People complain about any system which is more complex and performant than plain ordinary JSON. Remember how Microsoft pushed "Web Services" and introduced AJAX where the last letter was supposed to be XML?

Microsoft could make the case that many many features in Web Services were essential to making them work but people figured out you could just exchange JSON documents in a half-baked way and... it works.


The X in AJAX stands for XMLHttpRequest, which was the predecessor of the fetch API. It was originally used for XML in Outlook Web but wasn't tied to it; you can send any content type with it. Also, it was for the async web; I don't remember it having much relation to web services. Maybe SOAP, but that wasn't MS.

In the case of gRPC, I believe the spec is tied to protobuf, but I have seen a Thrift implementation as well.


> I keep wishing for an alternative project that only specifies a simpler, more compatible, easier-to-grok subset of gRPC's feature set. There's almost zero overlap between the features that I love about gRPC, and the features that make it difficult to advocate for adopting it at work.

Perhaps connect: https://connectrpc.com/


It is possible to do quite well, as demonstrated by .NET.

Edit: and, if I remember correctly, gRPC tooling for it is maintained by about 1-3 people who are also responsible for other projects, like System.Text.Json. You don't need big numbers to make something that is nice to use; quite often, they even make it more difficult.


For the longest time (up until earlier this year[0]), you couldn't even get the proto details from a gRPC error. IME the GP is correct: there are so many caveats to gRPC implementations that unless it is a prioritized, canonical implementation, it will be missing things. It seems there are tons of gRPC libraries out there that fell behind or were missing features that you don't know about until you need them (e.g. I recently had to do a big lift to implement HTTP proxy support in Tonic for a project).

0 - https://github.com/grpc/grpc-dotnet/issues/303


> unless it is a prioritized, canonical implementation it will be missing things

Perhaps. But for now, such a canonical implementation (I assume you are referring to the Go one?) is painful to access and has an all-around really bad user experience. I'm simply pointing out that more ecosystems could learn from .NET's example (or just adopt it) rather than continuing to exist in their own bubble, and more engineers could have zero tolerance for tooling with bad UX as it becomes more popular.

Now, with that said, I do not think gRPC is ultimately good, but I do think it's less bad than many other "language-agnostic" options - I've been burned badly enough by Avro, thanks. And, at least within .NET, accessing gRPC is even easier than OpenAPI-generated clients. If you care about overhead, Protobuf will always be worse than bespoke solutions like RKYV or MemoryPack.

I'm looking forward to solutions inspired by gRPC yet made specifically on top of HTTP/3's WebTransport, but that is something that is yet to come.


No disrespect to the other developers on the team, but James Newton-King is no ordinary developer.


.NET looks quite good, as does Swift actually. I have the most experience with Java, and those bindings are almost as nice as the .NET ones. I have also used Go quite a bit, and its bindings are pretty much awful. It takes so much practice and knowledge to use them well in Go.


>It is possible to do quite well, as demonstrated by {one of the other largest companies in the world}

FTFY


Does drpc meet your needs?

https://storj.github.io/drpc/


I'm surprised the author doesn't mention ConnectRPC: https://connectrpc.com/

It solves ALL the problems of vanilla gRPC, and it's even compatible with gRPC clients! It grew out of the Twirp protocol, which I liked so much I made a C++ implementation: https://github.com/Cyberax/twirp-cpp

But the ConnectRPC folks went further and built a complete infrastructure for RPC, including a package manager (buf.build) and integration with observability (https://connectrpc.com/docs/go/observability/).

And most importantly, they also provide a library to do rich validation (mandatory fields, field limits, formats, etc): https://buf.build/bufbuild/protovalidate

Oh, and for the unlimited message problem, you really need to use streaming. gRPC supports it, as does ConnectRPC.


Author here: I definitely should have been more explicit about my love for ConnectRPC, buf, protovalidate, etc. I do mention ConnectRPC, but maybe not as loudly as I could have. I try to avoid confessing my love for ConnectRPC in every post, but sometimes it's hard not to because they've made such good strategic decisions that just make sense and round out the ecosystem so well.


My fault, I did read the article but I overlooked the ConnectRPC mention.


The author does mention it in the article and is also a contributor to supporting tooling

https://github.com/sudorandom/protoc-gen-connect-openapi


The tooling around gRPC with bazel when using python is so bad it’s almost impossible to work with, which is hilarious considering they both come from Google. Then I had additional problems getting it to work with Ruby. Then I had more problems getting it to work in k8s, because of load balancing with http/2. Combine those issues with a number of other problems I ran into with gRPC, and I ended up just building a small JSON-RPC implementation that fit our needs perfectly.


Another point of view is, don't use Bazel. In my experience, Gradle is less of a headache and well supported.


Gradle is an anti-pattern. Just stick with Maven and live a happy life.

As someone who's used Gradle tons, 6 months ago I wrote in detail about why not gradle:

https://news.ycombinator.com/item?id=38875936

Gradle still might be less bad than Bazel, though.


I'm firmly in the Maven is better than Gradle camp. Yes, it's less flexible, that's why I like it.

Gradle might be better if it wasn't a poorly documented Groovy/Kotlin DSL where everything is a closure, but I do like the fact that if you want to do something in Maven you need a plugin that couples to known points in the lifecycle. It makes it explicit what is doing what and where.

And fully agree on the incredible pain of Gradle upgrades.


Maven is one of my favorite package managers. It's not as fancy as npm, cargo, go, etc., but it works consistently well and I never have to fight it.


For python? Gradle doesn't really support python.


I mean, gradle is a generic build tool. It could support it, just as it can be used to compile C (I have done the latter).


At Google scale, I’m sure excruciatingly horrible builds are no worry because they’re some other team’s problem. I hope JSON-RPC eats the world.


No, it's still horrible, we just don't understand why because some other team maintains it. We just suffer the long build times by killing time on memegen or visiting the MK.


We use the textproto format extensively at work; it's super nice to be able to define tests of your APIs using .textpb inputs and golden .textpb outputs. We have so many more tests using this method than if we manually called the APIs in a programming language for each test case, and I wouldn't want to use JSON for test inputs, since it lacks comments.
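
A rough sketch of that pattern in Go, for the curious (pb.FrobRequest, Frob, and loadGolden are hypothetical stand-ins; prototext and proto are the real packages from google.golang.org/protobuf):

    import (
        "os"
        "testing"

        "google.golang.org/protobuf/encoding/prototext"
        "google.golang.org/protobuf/proto"
    )

    func TestFrob(t *testing.T) {
        raw, err := os.ReadFile("testdata/frob_case1.textpb") // textproto input, comments allowed
        if err != nil {
            t.Fatal(err)
        }
        req := &pb.FrobRequest{} // hypothetical generated type
        if err := prototext.Unmarshal(raw, req); err != nil {
            t.Fatal(err)
        }
        got := Frob(req)                                            // hypothetical API under test
        want := loadGolden(t, "testdata/frob_case1.golden.textpb") // hypothetical golden loader
        if !proto.Equal(got, want) {
            t.Errorf("Frob() diverged from golden output:\n got: %v\nwant: %v", got, want)
        }
    }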


If you use intellij, you can annotate your text protos with some header comments and unlock schema validation, completion, etc.


Author here: Sorry that I was so harsh on textproto. You are right that it has some strengths over JSON... I'm actually a fan of JSONC for this reason. It does limit you on tooling... but so does textproto, right?

I think the bigger thing that I'm worried about is that gRPC has so many mandatory features that it can become hard to make a good implementation in new languages. To be honest there are some languages where the gRPC implementation is just not great and I blame the feature bloat... and I think textproto was a good demonstration of that feature bloat to me.


Yeah, the feature bloat makes it a hurdle to produce good quality implementations. I mostly stayed in Java for years; those bindings are quite good. grpc-web is OK I guess, protobuf-ts is great, Swift came along nicely, but I have always been saddened by how awful the bindings are in Go. You get terrible enums, terrible one-ofs, and the interceptors and service registration are very awkward.


"Adds a build step" is just not a thing you notice in any way if you also use Bazel, which I imagine Google imagines everyone doing. I don't really agree with any of the complaints in this article since they are sort of vague and apparently from the point of view of browsers, whereas I am a backend developer, but I think there is a large universe of things to complain about with gRPC. First and foremost, it seems as though bazel, protobuf C++, and gRPC C++ developers have never spoken to each other and are apparently not aware that it is almost impossible to build and link a gRPC C++ server with bazel. The bazel people have made it impossible to build protobuf with bazel 7 and the protobuf people have made it impossble to use bazel with protobuf 27, while the gRPC people and the rules_proto people are not even in the conversation. The other complaint from the C++ side is that the implementation is now so slow that Go and Java beat it easily, making C++ people look stupid at work.


The last time I attempted to use GRPC++ it was pretty hard to build even without the heaping monstrosity that is Bazel.


My biggest complaint with gRPC is proto3 making all nested message fields optional while making primitives always present with default values. gRPC is contract-based, so it makes no sense to me that you can't require a field. This is especially painful from an ergonomics viewpoint: you have to null-check every field with a non-primitive type.


If I remember correctly, the initial version allowed required fields, but it caused all sorts of problems when trying to migrate protos, because a new required field breaks all consumers almost by definition. So updating protos in isolation becomes tricky.

The problem went away with all optional fields so it was decided the headache wasn't worth it.


I used to work at a company that used proto2 as part of a homegrown RPC system that predated gRPC. Our coding standards strongly discouraged making anything but key fields required, for exactly this reason. They were just too much of a maintenance burden in the long run.

I suspect that not having nullable fields, though, is just a case of letting an implementation detail, keeping the message representation compatible with C structs in the core implementation, bleed into the high-level interface. That design decision is just dripping with "C++ programmers getting twitchy about performance concerns" vibes.


Zoox?


According to a blog post by one of the guys who worked on proto3, the complexity around versioning and required fields was exacerbated because Google also has “middle boxes” that read the protos and forward them on. Having a contract change between 2 services is fine, and required fields are probably fine, but having random middle boxes really makes everything worse for no discernible benefit.


proto2 allowed both required fields and optional fields, and there were pros and cons to using both, but both were workable options.

Then proto3 went and implemented a hybrid that was the worst of both worlds. They made all fields optional, but eliminated the introspection that let a receiver know if a field had been populated by the sender. Instead they silently populated missing fields with a hardcoded default that could be a perfectly meaningful value for that field. This effectively made all fields required for the sender, but without the framework support to catch when fields were accidentally not populated. And it made it necessary to add custom signaling between the sender and receiver to indicate message versions or other mechanisms so the receiver could determine which fields the sender was expected to have actually populated.


You can now detect field presence in several cases: https://protobuf.dev/programming-guides/field_presence/#pres...

This is very solid with message types, and for basic types you can mark a field `optional` if needed as well (essentially making the value nullable).
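
For example, with protoc-gen-go an `optional` scalar comes out as a pointer field, so presence is just a nil check. A minimal sketch, assuming a hypothetical Point message in a hypothetical pb package:

    // proto: message Point { optional double lat = 1; }
    // protoc-gen-go generates Lat as *float64 plus a nil-safe GetLat() float64.
    func latitudeOrDefault(p *pb.Point, def float64) float64 {
        if p == nil || p.Lat == nil {
            return def // never set: distinguishable from an explicit 0.0
        }
        return p.GetLat() // explicitly set, even when the value is 0.0
    }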


proto2 required fields were basically unusable for some technical reason I forget, to the point where they're banned where I work, so you had to make everything optional, when in many cases it was unnecessary. Leading to a lot of null vs 0 mistakes.

Proto3 made primitives all non-optional, default 0. But messages were all still optional, so you could always wrap primitives that really needed to be optional. Then they added optionals for primitives too recently, implemented internally as wrapper messages, which imo was long overdue.


The technical reason is that they break the end-to-end principle. In computer networking in general, it is a desirable property for an opaque message to remain opaque, and agnostic to every server and system that it passes through. This is part of what makes TCP/IP so powerful: you can send an IP packet through Ethernet or Token Ring or Wi-Fi or carrier pigeon, and it doesn't matter, it's just a bunch of bytes that get passed through by the network topology.

Protobufs generally respect this property, but required fields break it. If you have a field that is marked 'required' and then a message omitting it passes through a server with that schema, the whole message will fail to parse. Even if the schema definition on both the sender and the recipient omits the field entirely.

Consider what happens when you add a new required field to a protobuf message, which might be embedded deep in a hierarchy, and then send it through a network of heterogenous binaries. You make the change in your source repository and expect that everything in your distributed system will instantly reflect that reality. However, binaries get pushed at different times. The sender may not have picked up the new schema, and so doesn't know to populate it. The recipient may not have picked up the new schema, and so doesn't expect to read it. Some message-bus middleware did pick up the new schema, and so the containing message which embeds your new field fails to parse, and the middleware binary crashes with an assertion failure, bringing down a lot of unrelated services too.


If you want to precisely capture when a field must be present (or will always be set on the server), the field_behavior annotation captures more of the nuance than proto2's required fields: https://github.com/googleapis/googleapis/blob/master/google/...

You could (e.g.) annotate all key fields as IDENTIFIERs. Client code can assume those will always be set in server responses, but are optional when making an RPC request to create that resource.

(This may just work in theory, though – I’m not sure which code generators have good support for field_behavior.)


That decision seems practical (especially at Google scale).

I think the main problem with it, is that you cannot distinguish if the field has the default value or just wasn't set (which is just error prone).

However, there are solutions to this, that add very little overhead to the code and to message size (see e.g. [1]).

[1]: https://protobuf.dev/programming-guides/dos-donts/


But you can distinguish between default and unset: all optional fields have has_ method associated with them: https://protobuf.dev/reference/cpp/cpp-generated/#fields


The choice to make 'unset' indistinguishable from 'default value' is such an absurdly boneheaded decision, and it boggles my mind that real software engineers allowed proto3 to go out that way.

I don't get what part of your link I'm supposed to be looking at as a solution to that issue? I wasn't aware of a good solution except to have careful application logic looking for sentinel values? (which is garbage)


Yes, proto3 as released was garbage, but they later made it possible to get most proto2 behaviors via configuration.

Re: your question, for proto3 an field that's declared as "optional" will allow distinguishing between set to default vs. not set, while non-"optional" fields don't.


Ah, ok! Yeah I think we've been working with an older version of protobuf for a while where that wasn't an option.


Yet still in 2024, supporting optional is off by default for some languages in protoc...


RPC/proto is for transport.

Required/validation is for application.


If that's true, why have types in proto at all? Shouldn't everything be an "any" type at the transport layer, and the application can validate the types are correct?


Different types can and do use different encodings


It was a little weird at first, but if you just read the 1.5 pages of why protobuf decided to work like this it made perfect sense to me. It will seem over complicated though to web developers who are used to being able to update all their clients at a whim at any moment.


proto3 added support for optional primitives sorta recently. I've always been happy without them personally, but it was enough of a shock for people used to proto2 that it made sense to add them back.


Just out of curiosity, what domain were you working in where "0.0" and "no opinion" were _always_ the same thing? The lack of optionals has infuriated me for years and I just can't form a mental model of how anybody ever found this acceptable to work with.


Like nearly every time, empty string and 0 for integers can be treated the same as "no value" if you think about it. Are you sending data or sending opinions? Usually, to force a choice, you would make an enum or a one-of where the zero value means the client has forgotten to set it, and it can be modelled as an API error. Whether the value was actually on the wire or not is not really that important.
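
A sketch of that "zero value means the client forgot" pattern in Go, assuming a hypothetical pb.Confidence enum and pb.ScoreRequest message; status and codes are the standard grpc-go packages:

    import (
        "google.golang.org/grpc/codes"
        "google.golang.org/grpc/status"
    )

    // proto: enum Confidence { CONFIDENCE_UNSPECIFIED = 0; CONFIDENCE_LOW = 1; CONFIDENCE_HIGH = 2; }
    func validate(req *pb.ScoreRequest) error {
        if req.GetConfidence() == pb.Confidence_CONFIDENCE_UNSPECIFIED {
            // the zero value doubles as "not set", so treat it as a caller error
            return status.Error(codes.InvalidArgument, "confidence must be set")
        }
        return nil
    }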


0 as a default for "no int" is tolerable, 0.0 as a default for "no float" is an absolute nightmare in any domain remotely related to math, machine learning, or data science.

We dealt with a bug that for weeks was silently corrupting the results of trials pitting the performance of various algos against each other. Because a valid response was "no reply/opt out", combined with a bug in processing the "opt out" enum, also combined with a bug in score aggregation, functions were treated like they replied "0.0" instead of "confidence = None".

It really should have defaulted NaN for missing floats.


What about floats made this a problem? We're treating 0.0 specially in some places.


I think your anecdote is rather weak with regards to the way protobuf works, but to entertain it: why would a confidence of 0.0 be so different from None? 0.0 sounds very close to None for most numerical purposes if you ask me.

Wait, are you using Python?


    message LatLon {
        double lat = 1; // Geodetic latitude, decimal degrees
        double lon = 2; // Geodetic longitude, decimal degrees
    }

"Hmm, lat = 0, I guess they didn't fill out the message. I'll thrown an exception and handle it as an api error"

[Later, somewhere near the equator]

"?!?!??!"

------------------

"Ok, we learned our lesson from what happened to our customers in Sao Tome and Principe: 0.0 is a perfectly valid latitude to have. No more testing for 0, we'll just trust the value we parse.

[Later, in Norway, when a bug causes a client to skip filling out the LatLon message]

"Why is it flying to Africa?!?!"

------------------

Ok, after the backlash from our equatorial customers and the disaster in Norway, we've learned our lesson. We will now use a new message that lets us handle 0's, but checks that they really meant it:

    message LatLonEnforced {
        optional double lat = 1;
        optional double lon = 2;
    }

[At some third party dev's desk] "Oh, latitude is optional - I'll just supply longitude"

[...]

"It's throwing exceptions? But your schema says it's optional!"

------------------

Ok, it took some blood sweat and tears but we finally got this message definition licked:

    message LatLonEnforced {
        optional double lat = 1; // REQUIRED. Geodetic latitude, decimal degrees
        optional double lon = 2; // REQUIRED. Geodetic longitude, decimal degrees
    }

[Later, in an MR from a new hire] "Correct LatLon doc strings to reflect optional semantics"


If both lat and lon are required, you don't need to throw an exception for lat=0. If you want lat=null lon=0.0 to mean something like "latitude is unknown but longitude is known to be 0.0," yeah you need optional or wrapped primitives.

Edit: If a client doesn't fill out the LatLng message, that's different from lat and/or lon being null or 0. The whole LatLng message itself will be null. Proto3 always supported that too. But it's usually overkill to check the message being null, unless you added it later and need to specially deal with clients predating that change. If the client just has a bug preventing it from filling out LatLng, that's the client's problem.

The confusing part here is that even if the LatLng message is null, LatLng.lat will return 0 in some proto libraries, depending on the language. You have to specifically check if the message is null if you care. But there are enough cases where you have tons of nested protos and the 0-default behavior is actually way more convenient.
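
To see both behaviors concretely in Go (hypothetical pb.Request and LatLon types; protoc-gen-go getters are nil-safe):

    // proto: message Request { LatLon position = 1; }
    func latitude(req *pb.Request) (float64, bool) {
        if req.GetPosition() == nil {
            return 0, false // the whole LatLon message was never set
        }
        // Without the check above, req.GetPosition().GetLat() still "works"
        // and silently returns 0.0, because generated getters are nil-safe.
        return req.GetPosition().GetLat(), true
    }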


Yeah - I think what I'm getting at though is that you want to guard against situations where somebody accidentally doesn't set one of the fields, and yet 0 is a valid value to set on that field. You could accidentally forget to fill in latitude, and that would be bad news.

Def +1 the confusingness of how you have to explicitly check for has_latlon() to make sure you're not just getting default values of a non-existent LatLon message. The asymmetry between primitive and message-type fields in having explicit presence checking is also part of my beef. It's weird to have to teach junior devs that you can do has_* checks, but only for messages.


It's safer in a way to guard against these situations, but it seems like they don't intend you to do that often, because the downsides outweigh it. Proto2 had the "required" feature that had its own issues. Our team trusts to some degree that the user actually read the API spec, and so far it's been ok.

I can imagine message nullness being clearer in languages with optional unwrapping like JS. Like foo.bar.baz gives an error if bar is null, and if you want default-0 you use foo.bar?.baz. Idk if that's what happens though.


In the case of Lat/Lon, I guess 0.0 could have a meaning, though it is very unlikely someone is at exactly lat/lon 0.0. An alternative is to translate to the XY coordinate system, though that is not a perfect solution either.

If you really feel like expressing that LatLon as possibly null, it should rather be:

    message User {
        optional LatLon position = 1;
    }


Working with gRPC allowed me to understand how Go(lang)'s use of things like sql.NullString works (pseudo-ish code):

  astring := sql.NullString{String: "", Valid: false}

  if !astring.Valid {
      // don't do anything
  } else {
      process(astring.String)
  }

So similarly, protobuf messages have a HasField method (in some languages), so:

  if packet.HasField("astring") { process(packet.astring) }

Is it wordy? Yes. Is it elegant? Sadly, no. But does it work? Yes.


Yes, it was Python, but that has nothing to do with it. The same would happen in Go, Rust, R, or MATLAB.

Correct answers: 1.0, 0.0, 1.0

Confidence from algo: 1.0, 0.0, n/a

Confidence on the wire: 1.0, 0.0, 0.0

Score after bug: 66%

Score as it ought to be scored: 100%

It was enough that several algorithms which were very selective in the data they would attempt to analyze (think jpg vs png images) went from "kinda crap" in the rankings to "really good".


Well, only in Python is that N/A value also a float. In protobuf, Go, or Java for that matter, the data model must somehow be changed to communicate the difference.

If you had used 3 float values in Go or Java, you would have had the same problem.


Yeah, I think it's best to first rethink null as just: not 0 and not any other number. What that means depends on the context.

Tangent: I've seen an antipattern of using enums to describe the "type" of some polymorphic object instead of just using a oneof with type-specific fields. Which gets even worse if you later decide something fits multiple types.


I love oneofs in languages with good support, but they are woeful in Golang and the "official" grpc-web message types.


Not really one domain in particular, just internal services in general. In many cases, the field is intended to be required in the first place. If not, surprisingly 0 isn't a real answer a lot of the time, so it means none or default: any kind of numeric ID that starts at 1 (like ticket numbers), TTLs, error codes, enums (0 is always UNKNOWN). Similarly with empty strings.

I have a hard time thinking of places I really need both 0 and none. The only example that comes to mind is building room numbers, in some room search message where we wanted null to mean wildcard. In those cases, it's not hard to wrap it in a message. Probably the best argument for optionals is when you have a lot of boolean request options where null means default, but even then I prefer instead naming them such that the defaults are all false, since that's clearer anyway.

It did take some getting used to and caused some downstream changes to how we design APIs. I think we're better for it because the whole null vs 0 thing can be tedious and error-prone, but it's very opinionated.


Any time you have a scalar/measurement number, basically any value with physical units, counts, percentages, anything which could be in the denominator of a ratio, those are all strong indicators of a "semantic zero" and you really want to tell the difference between None and 0. They are usually floats, but could be ints (maybe you have number_of_widgets_online, 0 means 0 units, None means "idk".)


What's the difference between none inches and 0 inches? Might need a concrete example. We deal with space a fair amount and haven't needed many optionals there.


They just gave a concrete example: The difference between "we don't know" and "we know that it's zero".

Here's another fun one: I've seen APIs where "0.0" was treated as "no value, so take the default value". The default value happened to be 0.2.


I had the same experience. It was a bit awkward for 6 months, but down the line we learned to design better APIs, and dealing with nullable values is tedious at best. It's just easier knowing that a string or integer will _never_ cause a null pointer.


Huh, yeah I see. I guess I work more on the robotics side where messages often contain physical or geometric quantities, and that colors my thinking a bit. So "distance to that thing = 0" is a very possible situation, and yet you also want to allow it to say "I didn't measure distance to that thing". And those are very distinct concepts you never want to conflate.


I can see that. As for the rare situations where I needed 0 vs null: if for some reason that situation were multiplied by 100, I'd start wanting an optional keyword.


Same, I rarely felt a need for this distinction.


Re. bad tooling. grpcurl[1] is irreplaceable when working with gRPC APIs. It allows you to make requests even if you don't have the .proto around.

[1]: https://github.com/fullstorydev/grpcurl


> make requests even if you don't have the .proto around

Like this?

    > grpcurl -d '{"id": 1234, "tags": ["foo","bar"]}' \
        grpc.server.com:443 my.custom.server.Service/Method

How is that even possible? How could grpcurl know how to translate your request to binary?


If I recall correctly, ProtoBuf has a reflection layer, and it's probably using that.


I could be wrong, but it is probably using JSON encoding for the object body and implementing the gRPC transport instead of plain HTTP. Proto objects support JSON encode/decode by default in all the implementations I've seen.

https://grpc.io/blog/grpc-with-json/
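
If I understand grpcurl correctly, it combines both: descriptors come from the gRPC server reflection service (or a protoset file), and the JSON body is converted through protobuf's canonical JSON mapping. In Go, the core of that trick looks roughly like this (descriptor lookup omitted; a sketch, not grpcurl's actual code):

    import (
        "google.golang.org/protobuf/encoding/protojson"
        "google.golang.org/protobuf/proto"
        "google.golang.org/protobuf/reflect/protoreflect"
        "google.golang.org/protobuf/types/dynamicpb"
    )

    // jsonToWire fills a dynamic message (built from a descriptor, e.g. one
    // fetched via server reflection) from JSON, then re-encodes it as binary
    // protobuf ready to send in a gRPC request.
    func jsonToWire(desc protoreflect.MessageDescriptor, jsonBody []byte) ([]byte, error) {
        msg := dynamicpb.NewMessage(desc)
        if err := protojson.Unmarshal(jsonBody, msg); err != nil {
            return nil, err
        }
        return proto.Marshal(msg)
    }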


One can use Kreya for a GUI version


I just build a cli in Java or Go. It literally takes minutes to build a client.


I remember being surprised at how hard it was to read the source code for grpc Java. There's an incredible amount of indirection at every turn.

This made it extremely hard to get answers to questions that were undocumented.

It's a shame because I know Google can put out easy to read code (see: the go standard library).


> It's a shame because I know Google can put out easy to read code (see: the go standard library).

My guess is that the difference is that go is managed by a small group of engineers that have strong opinions, really care about it, and they have reached "fuck you level", so they can prioritize what they think is important instead of what would look good on a promo packet.


Sass was the first code generator I ever met that produced decent output. I’d been coding for 15 years at that point. Less is the second one.

That’s the end of this story. There are basically two people in the world who have demonstrated that they can be trusted to generate code that isn’t an impenetrable quagmire of pain and confusion.

I doubt it’s an accident that both emitted declarative output rather than imperative, but I would be happy to be proven wrong.


Some of the details inside protobuf in Java can be very convoluted, but they are also the result of intense benchmarking, years of experience, and a long tail of deep legacy support for old Java.

Honestly, I found the Java bindings to be way better designed and thought out than the Go ones. On a consumer level, the immutable message builders are fantastic, the one-ofs are decent compared to what Java can offer, and the service bindings actually provide a beautiful abstraction with their 0-1-many model. In Golang, if you only have to deal with unary RPCs they are OK I guess, but I really miss the immutable messages.


To be clear, I'm not talking about generated code or anything touching protobuf serde. Just grpc-the-library. Interceptors, retry policies, channels, etc.


The generated C++ interfaces to gRPC are also filled with an incredible amount of indirection and unnecessary concepts. I'd say it's a "bad at writing complex things simply" culture rather than being Java-specific.


Autogenerated code in general tends to be unreadable. It's not easy and/or not a priority.


I think it's partly a culture thing. Java developers love indirection, and they're used to an ecosystem that doesn't want to be understood. An ecosystem that wants you to google whatever obtuse error message it decides to spit out, and paste whatever half thought out annotation some blog post spits back, into your code to make it work.

I've worked with people who considered anything that wasn't programmed with annotations to be "too advanced" for their use-case.


Java is on life support as a language, but the ecosystem is strong, that's why it has all these weird features via annotations. And people who use Java are just trying to get stuff done like everyone else.


How is it on life support, when it’s by far the biggest server-side language, running basically every top companies’ business critical infrastructure?

Also, annotations are just metaprogramming, which can be tastefully applied.


Like I said, the ecosystem is strong. The language's design hasn't aged well, so nowadays any Java code I see in prod has 1-4 annotations above every class and method to get around the limitations of the language. Similar to how some C code will rely heavily on macros.


That’s not due to the language, but due to the business domain (I assume web development). In this domain almost every framework, regardless of language, will heavily use metaprogramming, see django, etc.


Javascript and Golang didn't need metaprogramming for this. Some of this has to do with not adhering to the OOP-everywhere model. Where's the metaprogramming in Django?


Java is doing quite well.


What an idiotic comment.


Yes, all of it.

Google claims gRPC with protobuf yields a 10-11x performance improvement over HTTP. I am skeptical of those numbers because really it comes down to the frequency of data parsing into and out of the protobuf format.

At any rate, just use JSON with WebSockets. It's stupid simple and still 7-8x faster than HTTP, with far less administrative overhead than either HTTP or gRPC.


> JSON with WebSockets. Its stupid simple and still 7-8x faster than HTTP with far less administrative overhead than either HTTP or gRPC.

Everyone doing what you are saying ends up reinventing parts of gRPC, on top of reinventing parts of RabbitMQ. It isn't ever "stupid simple." There are ways to build the things you need in a tightly coupled and elegant way, but what people want is Next.js, that's the coupling they care about, and it doesn't have a message broker (neither does gRPC), and it isn't a proxy (which introduce a bajillion little problems into WebSockets), and WebSockets lifetimes don't correspond to session lifetimes, so you have to reinvent that too, and...
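
To make it concrete: just matching replies to requests over a raw WebSocket already needs an envelope with a correlation ID plus a table of pending calls, before you get to deadlines, cancellation, or reconnects. A minimal sketch of only that bookkeeping, in Go (hypothetical names, transport and error handling omitted):

    import (
        "encoding/json"
        "sync"
    )

    // Envelope is the ad-hoc frame format you end up inventing.
    type Envelope struct {
        ID     uint64          `json:"id"`
        Method string          `json:"method"`
        Body   json.RawMessage `json:"body"`
    }

    type Client struct {
        mu      sync.Mutex
        nextID  uint64
        pending map[uint64]chan Envelope // a read loop routes replies here by ID
    }

    // newCall allocates a correlation ID and a channel the matching reply will land on.
    func (c *Client) newCall() (uint64, chan Envelope) {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.nextID++
        ch := make(chan Envelope, 1)
        c.pending[c.nextID] = ch
        return c.nextID, ch
    }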


> but what people want is Next.js,

What people? Developers? This is why I will not do that work anymore. Don't assume to know what I want based upon some tool set or tech stack that you find favorable. I hate (HATE HATE) talk of tech stacks, the fantasy of the developer who cannot write original software, who does not measure things, and cannot provide their own test automation. They scream their stupidity for all the world to hear when they start crying about reinventing wheels, or some other empty cliche, instead of just delivering a solution.

What I want is two things:

1. Speed. This is not an assumption of speed. It's the result of various measurements in different execution contexts.

2. Less effort. I want to send a message across a network... and done. In this case you have some form of instruction or data package and then you literally just write it to the socket. That is literally 2 primitive instructions without abstractions, e.g. socket.write(JSON.stringify(thing));. It is without round trips, without headers, without anything else. You are just done.


> parts of

The counterpoint to the fact that gRPC and RabbitMQ handle whatever you're writing better than you do is that gRPC and RabbitMQ have immense amounts of complexity that you have to deal with despite the fact that you don't care about it


> At any rate just use JSON with WebSockets. Its stupid simple and still 7-8x faster than HTTP with far less administrative overhead than either HTTP or gRPC.

gRPC is not supposed to be a standard web communication layer.

There are times where you need a binary format and extremely fast serialization/deserialization. Video games are one example where binary formats are greatly preferred over JSON.

But I do agree that people keep trying to shove gRPC (or similar) into things where they aren't needed.


> gRPC is not supposed to be a standard web communication layer.

It kind of is. What do you think WebTransport in HTTP/3 is? It's basically gRPC Next. The only reason gRPC didn't make it as the standard web communication layer is because of one disastrous decision by one Chrome engineer in https://issues.chromium.org/issues/40388906, maybe because he woke up on the wrong side of the bed.


Can you expand on this a little? I could not work out the decision from the Chrome issue you linked to.


I think this blog post provides the context for that chromium discussion:

https://carlmastrangelo.com/blog/why-does-grpc-insist-on-tra...

(Somewhere in the middle of the article)


gRPC is meant for backend microservices among other things, and it's still painful for that for the reasons the article describes, all of which could've been fixed or avoided. Internally Google doesn't even use gRPC, they use something similar that has better internal support according to https://news.ycombinator.com/item?id=17690676

I also don't see what'd stop it from being used generally for websites calling the backend API. Even if you don't care about the efficiency (which is likely), it'd be nice to get API definitions built in instead of having to set up OpenAPI.


> Internally Google doesn't even use gRPC, they use something similar that has better internal support according to https://news.ycombinator.com/item?id=17690676

But that says that they do use gRPC internally on projects that are new enough to have been able to adopt it?


In 2018, there was some kind of push towards gRPC internally, but it was since abandoned and reversed among the few who actually switched. They still don't use it internally, only externally in some places.


So, wrong link?


That's great, but protobuf is slow as shit. I wouldn't use it in games.

If I was using something slow that needed flexibility I'd probably go with Avro, since it has more powerful schema evolution.

If I wanted fast I'd probably use SBE or Flatbuffers (although FB is also slow to serialise).


Depending on the use case, it's often better to just copy structs directly, with maybe some care for endianness (little-endian). But at this point, the two most popular platforms, ARM and x86, agree on endianness and most alignment.

There's almost no reason why RPC should not just be

  send(sk, (void *)&mystruct, sizeof(struct mystructtype), 0)


Do all of your platforms have the same word width? The same -fshort-enums settings? Do you know that none of your data structures include pointers? Do all of your systems use the same compiler? Compiler version?

I agree it will usually work, but this becomes an ABI concern, and it's surprisingly common to have ABI mismatches on one platform with the items I've noted above.


I've seen wire protocols that had junk for the alignment buffering in such structs. And I've seen people have to do a whole lot of work to make the wire protocol work on a newer compiler/platform. Also, the whole point of a network protocol being documented is that it decouples the interface (msgs over a network) from the implementation (parsing and acting on msgs). Your v1 server might be able to just copy the read buffer into a struct, but your v2 server won't. And it is possible and desirable to live in a world where you can change your implementation but leave your interface alone (although some parts of the software ecosystem seem to not know this nice fact and implicitly fight against realizing it).

My issue with gRPC is simple: the Go gRPC server code does a lot of allocations. I have a gRPC service where each container handles 50-80K/second of incoming calls, and I spend a ton of time in GC and in allocating headers for all the msgs. I have a similar REST service where I use fasthttp with 0 allocs (but with a stupidly high number of connections due to the lack of multiplexing through the connection).


Go's GC wasn't really made with throughput maximization in mind. It's a language that doesn't scale that well to take advantage of beefy nodes, and it has a weak compiler. I suppose Google's vision for it is to "crank the replica count up". gRPC servers built on top of ASP.NET Core, Java Vert.x and Rust Thruster will provide you with much higher throughput on multi-core nodes.

https://github.com/LesnyRumcajs/grpc_bench/discussions/441


Ignoring the incompatibilities in word size, endianness, etc, how does a Go or JavaScript or etc program on the receiving end know what `mystruct` is? What if you want to send string, list, map, etc data?


string, list, map, etc? You have to use an encoding scheme.

As for go / javascript? I think most languages have the ability to inspect a raw buffer.


> string, list, map, etc? You have to use an encoding scheme.

Yes, you have to use an encoding scheme like JSON or Protobufs. Dumping memory directly down the pipe as you're suggesting doesn't work.

> As for go / javascript? I think most languages have the ability to inspect a raw buffer.

No language has the ability to read a raw buffer and know what the contents are supposed to mean. There needs to be a protocol for decoding the data, for example JSON or Protobufs.


Won't work if your struct has any pointers in it.


I'd recommend not doing that then. Of course the same is true if you coerce a pointer to an int64 and store it in a protobuf.


It's not the pointers themselves so much as what they're typically used for. How would you do dynamic sizing? Imagine sending just a struct of integer arrays this way, you'd have to either know their sizes ahead of time or just be ok with sending a lot of empty bits up to some max size. And recursive structures would be impossible.

You could get around this with a ton of effort around serdes, but it'd amount to reinventing ASN1 or Protobuf.


A lot of protocols in low latency trading systems just have fixed maximum size strings and will right pad with NUL or ASCII space characters.

Packed structs with fixed size fields, little endian integers and fixed point is heaven to work with.


I can see that in niche situations, particularly if you have a flat structure and uniform hardware. Cap'n Proto is also a way to do zero-parsing, but it has other costs.


Protobuf can typically be about 70-80% smaller than the equivalent JSON payloads. If you care about Network I/O costs (at a large scale), you'd probably want to realize a benefit in cost savings like that.

Additionally, I think people put a lot of trust into JSON parsers across ecosystems "just working", and I think that's something more people should look into (it's worse than you think): https://seriot.ch/projects/parsing_json.html


Let's say I wanted to transfer a movie in MKV container format. Its binary and large at about 4gb. Would I use JSON for that? No. Would I use gRPC/protobuf for that? No.

I would open a dedicated TCP socket and a file system stream. I would then pipe the file system stream to the network socket. No matter what, you still have to deal with packet assembly, because if you are using TLS you have small packets (max size varies by TLS revision). If you are using WebSockets you have control frames, continuation frames, and frame header assembly. Even with that administrative overhead it's still a fast and simple approach.

When it comes to application instructions, data from some data store, any kind of primitive data types, and so forth I would continue to use JSON over WebSockets.


I agree, there's a lot to gain from getting away from JSON, but gRPC needs HTTP/1 support and better tooling to make that happen.


You probably want to check this out: https://connectrpc.com/


Thanks. I've got a little project that needs to use protobufs, and if my DIY approach of sending either application/octet-stream or application/json turns out to be too sketchy, I'll give Connect a try. Only reason I'm not jumping for it is it involves more dependencies.


To get feature parity, you still need an IDL to generate types/classes for multiple languages. You could use JSON Schema for that.

Websockets do not follow request/reply semantics, so you'd have to write that yourself. I'd prefer not to write my own RPC protocol on top of websockets. That said, I'm sure there are some off the shelf frameworks out there, but do they have the same cross-language compatibility as protobuf + gRPC? I don't think "just use JSON with websockets" is such a simple suggestion.

Of course, gRPC does have some of its own problems. The in-browser support is not great (non-existent without a compatibility layer?) last time I checked.


> Google claims gRPC with protobuf yields a 10-11x performance improvement over HTTP.

That... doesn't make any sense, since gRPC is layered on top of HTTP. There must be missing context here.


gRPC was based on a HTTP draft that predated the standardization of HTTP/2, so presumably that statement was said about HTTP/1. HTTP/2 may not have existed at the time it was asserted.


gRPC gives you multiplexing of slow requests over 1 TCP connection, which reduces all the work and memory related to 1 socket per pending request; gRPC also means you don't have to put the string name of a field on the wire, which makes your messages smaller, which puts less stress on the memory system and the network, assuming your field values are roughly as large as your field names.


Multiplexing is a HTTP/2 feature. But as gRPC was based on an early HTTP/2 draft, it beat HTTP/2 to the punch. Thus it is likely that HTTP/2 didn't exist at the time the statement was made and therefore HTTP/1 would have been implied.


Must be comparing to the equivalent JSON-over-HTTP usage.


I don't even care about the performance. I just want some way to version my messages that is backward and forward compatible and can be delivered in all the languages we use in production. I have tried to consume JSON over websockets before, and it's always a hassle with the evolution of the data format. Just version it in protobuf and push the bytes over a websocket if you have a choice. Also, load balancing web socket services can be a bitch. Just rolling out our web socket service would disconnect 500k clients in 60 seconds if we didn't do a huge amount of work.


> because really it comes down to the frequency of data parsing into and out of the protobuf format.

Protobuf is intentionally designed to NOT require any parsing at all. Data is serialized over the wire (or stored on disk) in the same format/byte order that it is stored in memory

(Yes, that also means that it's not validated at runtime)

Or are you referencing the code we all invariably write before/after protobuf to translate into a more useful format?


You’re likely thinking of Cap’n’Proto or flatbuffers. Protobuf definitely requires parsing. Zero values can be omitted on the wire so there’s not a fixed layout, meaning you can’t seek to a field. In order to find a fields value, you must traverse the entire message, and decode each tag number since the last tag wins.


> Data is serialized over the wire (or stored on disk) in the same format/byte order that it is stored in memory

That's just not true. You can read about the wire format over here, and AFAIK no mainstream language stores things in memory like this: https://protobuf.dev/programming-guides/encoding

I've had to debug protobuf messages, which is not fun at all, and it's absolutely parsed.


> Protobuf is intentionally designed to NOT require any parsing at all.

As others have mentioned, this is simply not the case, and the VARINT encoding is a trivial counterexample.

It is this required decoding/parsing that (largely) distinguishes protobuf from Google's flatbuffers:

https://github.com/google/flatbuffers

https://flatbuffers.dev/

Cap'n Proto (developed by Kenton Varda, the former Google engineer who, while at Google, re-wrote/refactored Google's protobuf to later open source it as the library we all know today) is another example of zero-copy (de)serialization.


> Protobuf is intentionally designed to NOT require any parsing at all

This is not true at all. If you have a language-specific class codegen'd by protoc then the in-memory representation of that object is absolutely not the same as the serialized representation. For example:

1. Integer values are varint encoded in the wire format but obviously not in the in-memory format

2. This depends on the language of course but variable length fields are stored inline in the wire format (and length-prefixed) while the in-memory representation will typically use some heap-allocated type (so the in-memory representation has a pointer in that field instead of the data stored inline)
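
A quick way to see point 1 in practice, a minimal sketch using the protowire helpers from the Go protobuf module:

    import (
        "fmt"

        "google.golang.org/protobuf/encoding/protowire"
    )

    func main() {
        b := protowire.AppendVarint(nil, 300) // wire bytes: ac 02 -- not any native in-memory integer layout
        v, n := protowire.ConsumeVarint(b)    // has to be decoded (i.e. parsed) to get 300 back
        fmt.Printf("% x -> %d (%d bytes)\n", b, v, n)
    }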


It's pretty ironic, but Microsoft decided to lean into gRPC support for C#/ASP.NET, and it's honestly really well done and has great devx.


Why is this ironic?


I just meant that one of the better implementations is in a language Google doesn't heavily use. Maybe it's not ironic and just a refreshing example of OSS at work.


Google ate their lunch with Chrome, so long ago.


What does that have to do with grpc support in C#?


My main problems with grpc are threefold:

- The implementation quality and practices vary a lot. The python library lacks features that the go library has because they are philosophically opposed to them. Protobuf/grpc version pinning between my dependencies has broken repeatedly for me.

- If you are a services team, your consumers inherit a lot of difficult dependencies. A normal JSON API does not do this; with OpenAPI the team can use codegen or not.

- The people who have been most hype to me in person about grpc repeat things like "It's just C structs on the wire" which is completely fucking wrong, or that protobuf is smaller than json which is a more situational benefit. My point being their "opinion" is uninformed and band-wagoning.

This article gave me some new options for dunking on grpc if it's recommended.


I had to chuckle when I read the "Bad Tooling" section, because anyone who has had to deal with COM and DCOM is painfully aware of how much better the development experience with gRPC happens to be, and it is incredible how bad the COM/DCOM tooling still is after 30 years, given its key role as a Windows API, especially since Vista.

Not even basic syntax highlighting for IDL files in Visual Studio, while nice goodies for doing gRPC are available in Visual Studio.


> nice goodies for doing gRPC are available in Visual Studio.

Could you elaborate on this? (Heavy grpc/C# usage here and we just edit the protos)


Imagine that instead of what you do with gRPC/C#, you had to edit proto files as if in Notepad (that is the COM IDL editing experience in VS), and instead of having VS take care of the C# code generation, you either call the MIDL CLI compiler yourself or manually integrate it into some build step, only to open the folder of generated code and then manually merge it with your existing code inside of Visual Studio.

That is the golden-path experience for doing COM in C++; doing COM in C# is somewhat better, but you still won't get rid of dealing with IDL files, especially now that TLB support is no longer available for .NET Core.

Quite tragic for such a key technology; meanwhile, C++ Builder offers a much more developer-friendly experience.


My biggest issue with gRPC is the direct mapping of IP addresses in the config or at runtime. From the docs: "When sending a gRPC request, the client must determine the IP address of the service name." https://grpc.io/docs/guides/custom-name-resolution/

My preferred approach would be to map my client to a "topic" and then any number of servers can subscribe to the topic. Completely decoupled, scaling up is much easier.

My second biggest issue is proto file versioning.

I'm using NATS for cross-service comms and it's great. I just wish it had a low-level serialization mechanism for more efficient transfer like gRPC.
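
For what it's worth, the linked custom-name-resolution hook is how you avoid hard-coding IPs on the client side; a rough Go sketch (the "topic" scheme, the addresses, and the discovery source are all made up, this is not a working NATS integration):

    package topicresolver

    import "google.golang.org/grpc/resolver"

    type topicBuilder struct{}

    func (topicBuilder) Scheme() string { return "topic" }

    func (topicBuilder) Build(target resolver.Target, cc resolver.ClientConn, _ resolver.BuildOptions) (resolver.Resolver, error) {
        // A real implementation would subscribe to service discovery
        // (e.g. a NATS subject) and call UpdateState whenever the set
        // of backends changes. The addresses below are placeholders.
        err := cc.UpdateState(resolver.State{Addresses: []resolver.Address{
            {Addr: "10.0.0.1:50051"},
            {Addr: "10.0.0.2:50051"},
        }})
        return nopResolver{}, err
    }

    type nopResolver struct{}

    func (nopResolver) ResolveNow(resolver.ResolveNowOptions) {}
    func (nopResolver) Close()                                {}

    func init() {
        // Clients can then dial "topic:///orders" instead of an IP address.
        resolver.Register(topicBuilder{})
    }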



I don't understand why there isn't an HTTP/1 mode for gRPC. Would cover the super common use case of client-to-server calls. Give people who already have your typical JSON-over-HTTP API something that's the same except more efficient and with a nicer API spec.

You know what's ironic? Google App Engine doesn't support HTTP/2. Actually, a lot of platforms don't.


The streaming message transfer modes are the main thing that make it difficult.


Streaming seems like it'd work without too much effort. It'd be less efficient for sure, but it's also not a very common use case.


In general, I agree. But my understanding is that the problem isn't streaming in the abstract, it's supporting certain details of the streaming protocol outlined in the gRPC spec.


FWIW I never worked at Google and I used protobuf / gRPC extensively at work and in nearly all of my side projects. Personally, I think overall it's great. I do wish trailers were an optional feature though.


A lot of this kind of criticism rubs me the wrong way, especially complaining about having to use words or maths concepts, or having to learn new things. That's often not really a statement on the inherent virtue of a tool, and more of a statement on the familiarity of the author.

I don't want to sound flippant, but if you don't want to learn new things, don't use new tools :D


Sending a request and getting a response back is not a new concept, it's about as old as computer networks in general, and gRPC is the only framework that refers to this concept as "unary". This is the original argument from the article and I tend to agree with it.


Monads and functors are nothing new either, but that doesn’t mean giving them that name was a bad idea.

Moreover, the term “unary” is used to distinguish from other, non-unary options: https://grpc.io/docs/what-is-grpc/core-concepts/


> I don't want to sound flippant, but if you don't want to learn new things, don't use new tools :D

That's precisely the problem. The author wants to convince people (e.g., his colleagues) to use a new tool, but he has to convince them to learn a bunch of new things including a bunch of new things that aren't even necessary.


IME gRPC is almost never the right balance of tradeoffs. There are (much) better tools for defining web APIs that web apps can actually use without a proxy, JSON encoding/decoding is easy to get to be real fast, and language support varies from great (Go, C++) to hmm (Java, Python). Debugging is painful, extra build steps and toolchains are annoying and flaky, dependencies are annoying, etc etc. 99% of people should probably just be using OpenAPI, and the other 1% should probably just use MessagePack.


A lot of tooling badness comes out of the fact that gRPC integration in its lingua franca, Go, requires manual wiring of protoc.

I don't know why or how there isn't a one-liner option there, because my experience with using gRPC in C# has been vastly better:

    dotnet add package Grpc.Tools // note below
    <Protobuf Include="my_contracts.proto" />
and you have the client and server boilerplate (client: give it a url and it's ready for use; server: inherit from the base class and implement call handlers as appropriate). It is all handled behind the scenes by protoc integration that plugs into MSBuild, and the end user rarely has to deal with its internals directly, unless someone abused definitions in .proto to work as a weird DSL for an end-to-end testing environment and got carried away with namespacing (which makes protoc plugins die for most languages, so it's not that common of an occurrence). The package readme is easy to follow too: https://github.com/grpc/grpc/blob/master/src/csharp/BUILD-IN...

Note: usually you need Grpc.Net.Client and Google.Protobuf too, but that's two `dotnet add package`s away.
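
For contrast, the "manual wiring" on the Go side usually means installing protoc plus two plugins and invoking it yourself, e.g. via a go:generate directive (the file and package names here are illustrative):

    package mypb

    // Requires protoc, protoc-gen-go and protoc-gen-go-grpc on PATH;
    // "go generate ./..." then regenerates the stubs.
    //go:generate protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative my_contracts.proto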


The Go tooling for gRPC is inexplicably bad, both in terms of ergonomics and in terms of performance.

The GoGoProtobuf [1] project was started to improve both. It would generate nice Go types that followed Go's conventions. And it uses fast binary serialization without needing to resort to reflection.

Unfortunately, the gRPC/Protobuf team(s) at Google are famously resistant to changes and were unwilling to work with the GoGo maintainers. As a result, the GoGo project is now dead. [2]

I've never used Buf, but it looks like it might fix most of the issues with the Go support.

[1] https://github.com/gogo/protobuf

[2] https://x.com/awalterschulze/status/1584553056100057088


Similar experiences with web services via WCF. It was in dealing with anything published that wasn't .NET where it got difficult. PHP services were not compliant with their own WSDL, and similar for internal types in Java from some systems. It was often a mess compared to the C# experience, hence everyone moving towards REST or simpler documentation that was easy to one-off as needed, or towards using an API client.


One of Go's goals is no arbitrary code execution during compiles, so it will ~never pull in any code generation tools and run them for you.


Insisting on a particularly exotic flavor of HTTP/2 is its most severe design flaw, I think. Especially as it could have worked in a transport-agnostic manner, e.g. on top of WebSockets.


Author here: it's nerdy web trivia but HTTP trailers are actually in the HTTP/1.1 spec, although very few browsers, load balancers, programming languages, etc. implemented it at the time since it wasn't super useful for the web. You are definitely correct that it is an exotic feature that often gets forgotten about.


Something I didn’t see listed was the lack of a package manager for protos.

For example if I want to import some common set of structs into my protos, there isn’t a standardized or wide spread way to do this. Historically I have had to resort to either copying the structs over or importing multiple protoc generated modules in my code (not in my protos).

If there was a ‘go get’ or ‘pip install’ equivalent for protos, that would be immensely useful; for me and my colleagues at least.


https://buf.build/ is this, no?


Thanks for sharing! Yes things like this would help solve our problems.


It is mentioned under the "Bad tooling" section


Oh my mistake, must have missed that.


One of my favourite bits is having to pass a JSON string to the Python library to configure a service. To this day I am not entirely sure it is adhering to the config.


same in java


I am surprised no-one is mentioning Buf for all the great work they've done with the CLI and Connect for much better devex, tooling, and interoperability.


The worst part of all is that most people don’t need gRPC, but use it anyway. It’s a net addition of complexity and you’re very likely not getting the actual benefits. I’ve seen countless simple REST APIs built with language-native tooling burned to the ground to be replaced with layers of gRPC trash that requires learning multiple new tools and DSLs, is harder to troubleshoot and debug, and ultimately tends to force API rigidity far sooner than is healthy.

One project I worked on was basically just a system for sharing a JSON document to multiple other systems. This was at a golang shop on AWS. We could have used an S3 bucket. But sure, an API might be nice so you can add a custom auth layer or add server side filters and queries down the road. So we built a REST API in a couple of weeks.

But then the tech lead felt bad that we hadn't used gRPC like the cool kids on other teams. What if we needed a Python client so we could build an Ansible plugin to call the API?? (I mean, Ansible plugins can be in any language; it's a REST API, and Ansible already supports calling that (or you could just use curl); or you could write the necessary Python to call the REST API in like three lines of code.) So we spent months converting to gRPC, except we needed to use the Connect library because it's cooler, except it turns out it doesn't support GET calls, and no one else at the company was using it.

By the time we built the entire service, we had spent months, it was impossible to troubleshoot, just calling the API for testing required all sorts of harnesses and mocks, no good CLI tooling, and we were generating a huge Python library to support the Ansible use case, but it turned out that wasn’t going to work for other reasons.

Eventually everyone on that team left the company or moved to other projects. I don’t think anything came of it all but we probably cost the company a million dollars. Go gRPC!


> The worst part of all is that most people don’t need gRPC, but use it anyway. It’s a net addition of complexity and you’re very likely not getting the actual benefits. I’ve seen countless simple REST APIs built with language-native tooling burned to the ground to be replaced with layers of gRPC trash that requires learning multiple new tools and DSLs, is harder to troubleshoot and debug, and ultimately tends to force API rigidity far sooner than is healthy.

This sounds odd to me because I don't really see how gRPC would cause any of those issues?

> layers of gRPC trash

What layers? Switching from REST (presumably JSON over http) to gRPC shouldn't introduce any new "layers". It's replacing one style of API call with a different one.

> learning multiple new tools and DSLs

New tools sure, you need protoc or buf to build the bindings from the IDL, but what is the new DSL you need to learn?

> ultimately tends to force API rigidity far sooner than is healthy

How does gRPC force API rigidity? It is specifically designed to be evolvable (sometimes to its usability detriment IMO)

There are some definite footguns with gRPC and I am becoming increasingly annoyed with Protobuf in particular as the years go on, but going back to REST APIs still seems like a huge step backwards to me. With gRPC you get a workflow that starts with a well-defined interface, and all the language bindings and client/server stubs are generated from that with almost zero effort. You can kind of/sort of do that with REST APIs using OpenAPI specs but in my experience it just doesn't work that well and language support is sorely lacking.


> What layers? Switching from REST (presumably JSON over http) to gRPC shouldn't introduce any new "layers".

Of course it does, starting with the protobufs and code generation. You say yourself in your very next reply:

"New tools sure, you need protoc or buf to build the bindings from the IDL, but what is the new DSL you need to learn?"

And the DSL is presumably protobuf, which you yourself are "increasingly annoyed" with.


You need all the same stuff with a REST API only instead of using tooling to codegen all the boilerplate you have to write it by hand (or use janky OpenAPI code generators which, in my experience, rarely work very well).

I am increasingly annoyed by protobuf as a standalone format, but given the choice between creating a new API using gRPC (where I can spend five minutes writing some proto files and then codegen all the boilerplate I need for both server and client in any mainstream language) and creating it as a REST API (where I have to manually code all the boilerplate and decide between a zillion different ways of doing everything), I will choose gRPC 100% of the time.


> You need all the same stuff with a REST API

That's just not true. A straightforward REST API is significantly simpler and less code throughout.


How exactly? If we take the simplest possible "hello world" service, then protoc generates all the code for a gRPC service without you having to manually type anything


People use it - like I do - because they like the improved type safety compared to REST. We use gRPC at $dayjob and I would hate going back to the stringly typed mess that is JSON over REST or the _really_ absurdly over engineered complexity trap that is GraphQL. gRPC lets us build type safe, self-documented internal APIs easily and with tooling like Buf, most of the pain is hidden.

The DSL I consider a plus. If you build REST APIs you will usually also resort to using a DSL to define your APIs, at least if you want to easily generate clients. But in this case the DSL is OpenAPI, which is an error prone mess of YAML or JSON specifications.


I use it because: 1. I am not writing a network API without a solid spec, and 2. I want to decouple the number of tcp connections from the amount of pending work. I don't want one wonky msg to consume many resources and I want a spike of traffic to cause more msgs to be sent to the worker pools not cause a bunch of TCP connection establishment, SSL handshakes, etc. I also find it personally offensive to send field names in each network msg, as per JSON or XML.


This, 100%. I am never going back to stringly typed JSON in whatever random url structure that team felt like doing that week. GraphQL is made for Facebook-type graph problems. It's way overcomplicated for most use cases. I just want a lingua franca DSL to enforce my API specification in a consistent manner. I don't care if it's PUT, POST, or PATCH. Just keep it easy to automate tooling.


> People use it - like I do - because they like the improved type safety compared to REST.

You don't need a binary format just to get type safety. JSONSchema, OpenAPI, etc exist after all.

> But in this case the DSL is OpenAPI, which is an error prone mess of YAML or JSON specifications.

They might not be pretty, but they're not particularly error prone (the specs themselves are statically checked).


YAML in any form is error prone and hard to write. Protobuf - for all its warts - is much easier to write and much more type safe.

But let’s just agree to disagree here. You do you and build REST APIs, while I’ll stick to gRPC


I’m not anti-protobuf; it’s just overkill for type safety. But yeah, use what you want.


This anecdote highlights scope creep and mismanagement, not a fault of gRPC.


I think the anecdote highlights that there's no incremental way to approach gRPC, it's not a low risk small footprint prototype project that can be introduced slowly and integrated with existing systems and environments. Which, well, it is a bit of a fault of gRPC.


I think that's not true. There are plenty of incremental ways to adopt gRPC. For example, there are packages that can facade/match your existing REST APIs[1][2].

Reads like a skill issue to me.

[1]: https://github.com/grpc-ecosystem/grpc-gateway [2]: https://github.com/connectrpc/vanguard-go


While one can't refute the existence of the mentioned warts, they are not a big concern practically. We use gRPC in our Partner SDK[0] and Connector SDK[1].

[0] https://fivetran.com/docs/partner-built-program [1] https://fivetran.com/docs/connectors/connector-sdk


Story time: the whole development of protobuf was... a mess. It was developed and used internally at Google long before it was ever open sourced.

Protobuf was designed first and foremost for C++. This makes sense. All of Google's core services are in C++. Yes there's Java (and now Go and to some extent Python). I know. But protobuf was and is a C++-first framework. It's why you have features like arena allocation [1].

Internally there was protobuf v1. I don't know a lot about this because it was mostly gone by the time I started at Google. protobuf v2 was (and, I imagine, still is) the dominant form of protobuf.

Now, this isn't to be confused with the API version, which is a completely different thing. You would specify this in BUILD files and it was a complete nightmare because it largely wasn't interchangeable. The big difference is with java_api_version = 1 or 2. Java API v1 was built like the java.util.Date class. Mutable objects with setters and getters. v2 changed this to the builder pattern.

At the time (this may have changed) you couldn't build the artifacts for both API versions and you'd often want to reuse key protobuf definitions that other people owned so you ended up having to use v1 API because some deep protobuf hadn't been migrated (and probably never would be). It got worse because sometimes you'd have one dependency on v1 and another on v2 so you ended up just using bytes fields because that's all you could do. This part was a total mess.

What you know as gRPC was really protobuf v3 and it was designed largely for Cloud (IIRC). It's been some years so again, this may have changed, but there was never any intent to migrate protobuf v2 to v3. There was no clear path to do that. So any protobuf v3 usage in Google was really just for external use.

I explain this because gRPC fails the dogfood test. It's lacking things because Google internally doesn't use it.

So why was this done? I don't know the specifics but I believe it came down to licensing. While protobuf v2 was open sourced the RPC component (internally called "Stubby") never was. I believe it was a licensing issue with some dependency but it's been awhile and honestly I never looked into the specifics. I just remember hearing that it couldn't be done.

So when you read about things like poor JSON support (per this article), it starts to make sense. Google doesn't internally use JSON as a transport format. Protobuf is, first and foremost, a wire format for C++-centric APIs (in Stubby). Yes, it was used in storage too (e.g. Bigtable).

Protobuf in JavaScript was a particularly horrendous Frankenstein. Obviously JavaScript doesn't support binary formats like protobuf. You have to use JSON. And the JSON bridges to protobuf were all uniquely awful for different reasons. My "favorite" was pblite, which used a JSON array indexed by the protobuf tag number. With large protobufs with a lot of optional fields you ended up with messages like:

    [null,null,null,null,...(700 more nulls)...,null,{/*my message*/}]
GWT (for Java) couldn't compile Java API protobufs for various reasons so had to use a variant as well. It was just a mess. All for "consistency" of using the same message format everywhere.

[1]: https://protobuf.dev/reference/cpp/arenas/


There was never any licensing issue, do you think Google would depend on third party software for anything as core as the RPC system? The issue was simply that at the time, there was a culture in which "open sourcing" things was being used as an excuse to rewrite them. The official excuse was that everything depended on everything else, but that wasn't really the case. Open sourcing Stubby could certainly have been done. You just open source the dependencies too, refactor to make some optional if you really need to. But rewriting things is fun, yes? Nobody in management cared enough to push back on this, and at some point it became just the way things were done.

So, protobuf1 which was perfectly serviceable wasn't open sourced, it was rewritten into proto2. In that case migration did happen, and some fundamental improvements were made (e.g. proto1 didn't differentiate between byte arrays and strings), but as you say, migration was extremely tough and many aspects were arguably not improvements at all. Java codebases drastically over-use the builder/immutable object pattern IMO.

And then Stubby wasn't open sourced, it was rewritten as gRPC which is "Stubby inspired" but without the really good parts that made Stubby awesome, IMO. gRPC is a shadow of its parent so no surprise no migration ever happened.

And then Borg wasn't open sourced, it was rewritten as Kubernetes which is "Borg inspired" but without the really good parts that make Borg awesome, IMO. Etc.

There's definitely a theme there. I think only Blaze/Bazel is core infrastructure in which the open source version is actually genuinely the same codebase. I guess there must be others, just not coming to mind right now.

Using the same format everywhere was definitely a good idea though. Maybe the JS implementations weren't great, but the consistency of the infrastructure and feature set of Stubby was a huge help to me back in the days when I was an SRE being on-call for a wide range of services. Stubby servers/clients are still the most insanely debuggable and runnable system I ever came across, by far, and my experience is now a decade out of date so goodness knows what it must be like these days. At one point I was able to end a multi-day logs service outage, just using the built-in diagnostics and introspection tools that every Google service came with by default.


What are those “really good parts” that Stubby and Borg have but their open source versions don’t?


I'll admit it's been a while since I looked at gRPC / Kubernetes and I never used them in anger for real projects. It's possible that some of these claims are wrong or can be filled with plugins, have been fixed in newer releases, etc. Also everything here about Google's stuff is a decade+ out of date. It might all be different now.

One thing that I really miss from other RPC systems is the variety of debug endpoints. Stubby piggybacked on HTTP a bit like gRPC does by registering endpoints into a pre-existing HTTP server. One was a magic hidden endpoint that just converted the socket into a Stubby socket, but others let you do things like:

• Send an RPC by filling out an auto-generated HTML form, so you could also use curl to send RPCs for debugging purposes. There is an OpenAPI based thing that gives you something similar for REST these days, but it's somehow heavier and not quite as clean.

• View all RPCs that were in-flight, including cross-process traces, how often the RPCs had retried etc. This made it very easy to figure out where an RPC had got stuck even if it had crossed several machines. In the open world there's Jaeger and similar, I haven't tried those, but this was built in and didn't require any additional tools.

• View latency histograms of RPCs, connected machines, etc. View the stack traces of all the threads.

• They had a global service discovery system that was basically a form of reactive DNS, i.e. you could subscribe to names and receive push notifications when the job got moved between different underlying machines.

• Endpoints for changing the values of flags/parameters on the fly (there were thousands exposed like this).

• RPC routing was integrated with the global load balancing system.

Probably a dozen more things I forgot.

All this made it very easy to explore and diagnose systems using just a web browser, and you didn't face problems of finding servers that didn't have these features because every app was required to use the in-house stack and all the needed features were enabled by default. Whereas in most open source server stacks the authors are obsessed with plugins, so out of the box they do very little and companies face an uphill battle to ensure everything is consistent.

For clusters the main difference I remember is that Borg had a proper config language instead of the weird mashed up YAML templating thing Kubernetes uses, and the Borg GUI was a lot cleaner and more info-dense than the Material Design thing that Kubernetes had, and the whole reactive naming system was deeply integrated in a natural way. Also Kubernetes is all about Docker containers, which introduces some complexity that Borg didn't have. I had problems in the past with k8s/docker doing dumb things like running out of disk space because containers weren't being purged at the right times, and kernel namespaces have also yielded some surprises. At the time Borg didn't really use namespacing, just chroots.

There are some minor stylistic differences. The old Google internal UI had a simple industrial feel. It was HTML written by systems engineers so everything was very simple, info dense, a few blocks of pastel colors here and there. Imagine the Linux kernel guys making web pages. Meaning: very fast, lightweight, easy to scrape if necessary.


Yeah my memory of this (which is admittedly fuzzy) is that a bunch of business things happened around 2012-2015, which led to these external software libraries / products that are "arguably inferior rewrites" and "not what Google actually uses"

I think 2012 is when Larry became CEO (again), and 2015 is when the "Alphabet" re-org / re-naming happened.

1. Larry Page was generally unhappy with the direction and execution of the company, so he became CEO. (Schmidt would never be CEO again)

2. VP Bill Coughran was shown the door (my interpretation, which is kind of like Eric Schmidt being shown the door). For my entire time there he had managed the software systems -- basically everything in google3, or everything important there

3. Urs Hoezle took over everything in technical infrastructure. I think he had previously been focused on hardware platforms and maybe SRE; now he was in charge of software too.

Urs sorta combined this "rewrite google3" thing with the "cloud" thing. To me there was always a tenuous connection there, at least technically. I can see why it made sense from a business perspective

---

Basically Larry was unhappy with google3 because the company wasn't shipping fast enough, e.g. compared to Facebook. It was perceived as mired in technical debt and processes (which IMO was essentially true, and maybe inevitable given how fast the company had grown for ~8 years)

And I think they were also looking over their shoulders at AWS, which I think by then had become "clearly important".

Why don't we have an AWS thing? At some point GCE was kind of a small project in Seattle, and then it became more important when AWS became big.

Anyone remember when Urs declared that google3 was deprecated and everything was going to be written on top of cloud in 12 to 18 months? (what he said was perhaps open to interpretation -- I think he purposely said something really ambitious to get everyone fired up)

So there was this shift to "externalize" infrastructure, make it a real product. Not just have internal customers, but external ones too.

---

So I think what you said is accurate, and I think that is the business context where the "arguably inferior rewrites" came from

- Kubernetes is worse in many ways than Borg [1]

- gRPC (I haven't used it) is apparently worse in many ways than Stubby, etc.

I'd be interested if anyone has different memories ...

---

[1] although I spent some time reading Borg source code, and e.g. compared to say the D storage server, which was also running on every node, it was in bad shape, and inefficient. There are probably ways that K8s is better, etc.

My main beef is the unimaginable complexity of running K8s on top of GCE on top of Borg -- i.e. 3 control planes stacked on top of each other ...


> do you think Google would depend on third party software for anything as core as the RPC system?

I don't believe Google has (had?) any objections to using open source or open sourcing things but you have to remember two things:

1. Most companies weaponize open source. They use it to "commoditize their product's complements" [1]; and

2. Google3 is so deeply integrated that you can't really separate some of the tech because of the dependencies on other tech. More on that below.

> Open sourcing Stubby could certainly have been done. You just open source the dependencies too

Yeah, I don't think it's always that simple. You may not own the rights to something to be able to open source it. Releasing something may trigger more viral licenses (ie GPL) to force you to open source things you don't want to or can't.

I actually went through the process of trying to import a few open source packages into Google's third party repo and there are a lot of "no nos". Like a project had to have a definite license (one that was whitelisted by legal). Some projects liked to do silly things like having a license like "do whatever you want" or "this is public domain". That's not how public domain works BTW. And if you contacted them, they would refuse to change it to even something like an MIT license, which basically means the same thing, because they didn't understand what they were doing.

> And then Borg wasn't open sourced

This actually makes sense. Later on you suggest you were a Google SRE so you should be aware of this, but to whoever else reads this: Google's traffic management was deeply integrated into the entire software stack. Load balancing, DDoS defense, inter-service routing, service deployment onto particular data centers, cells and racks and so on.

It just doesn't make sense to open source Borg without everything from global traffic management down to software network switching.

> I think only Blaze/Bazel is core infrastructure in which the open source version is actually genuinely the same codebase

I don't know the specifics but I believe that Bazel too was "Blaze inspired". I suspect it's still possible to do things in Blaze that you can't do in Bazel even though the days of Blaze BUILD files being Python rather than "Python syntax like" are long gone.

Also, Blaze itself has to integrate with various other systems that Bazel doesn't, e.g. ObjFS, SrcFS, Forge, Perforce/Piper, MPM and various continuous build systems.

[1]: https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/


Everything that the core stack depended on was written from scratch by Google, there were no third party dependencies with unknown licenses that I ever encountered, or any third party dependencies at all. They started with the STL + POSIX and worked up from there.

I'm pretty sure stuff could have been split out. The architecture was sound and the entanglement was overstated. Nothing would have stopped you bringing up a Borg cluster without the global HTTP routing / DDoS / traffic management stuff, as evidenced by the fact that those parts changed regularly without needing synchronized releases of the other parts.


I think there is a really nice opportunity to take some emerging open source standards such as CBOR for the wire format, CDDL for the schema definition and code generation inputs and WebTransport for the actual transport layer.


gRPC is deliberately designed not to be dependent on protobuf for its message format. It can be used to transfer other serialization formats. However, the canonical stub generator, which is not hard to replace at all, assumes proto so when people hear gRPC they really think of Protobuf over gRPC. Most of the complaints should be directed at protobuf, with or without gRPC.

The primary misfeature of gRPC itself, irrespective of protobuf, is relying on trailers for status code, which hindered its adoption in the context of web browser without an edge proxy that could translate gRPC and gRPC-web wire formats. That alone IMO hindered the universal applicability and adoption quite a bit.


Do you know of an example where this is done? I didn't know that and we are currently using a customized wire format (based on a patched Thrift), so I thought gRPC wouldn't be an option for us.


I have done it in proprietary settings. Nothing off the top of my head. The gRPC libraries themselves are pretty straightforward. You just need to use thrift IDL parser to output stubs that use gRPC under the hood.

The C++ one may be slightly more challenging to replace because extra care is needed to make sure protobuf message pipeline is zero-copy. Other languages are more trivial.

One place to start would be to look at the gRPC protoc plugin and see how it’s outputting code and do something similar. Pretty lean code.
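
In Go, for instance, the payload side of this is just a codec registration; a sketch assuming you already have non-protobuf stubs (the "json" codec here is only to illustrate the hook, it is not how the parent's Thrift setup works):

    package jsoncodec

    import (
        "encoding/json"

        "google.golang.org/grpc/encoding"
    )

    type jsonCodec struct{}

    func (jsonCodec) Marshal(v any) ([]byte, error)      { return json.Marshal(v) }
    func (jsonCodec) Unmarshal(data []byte, v any) error { return json.Unmarshal(data, v) }
    func (jsonCodec) Name() string                       { return "json" }

    func init() {
        // Callers pick it per-RPC with grpc.CallContentSubtype("json"),
        // or bake it into their (custom-generated) stubs.
        encoding.RegisterCodec(jsonCodec{})
    }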


This is the true design issue with gRPC as I see it. It would be way bigger without this. I love protobuf though, gRPC is just alright. At least gRPC makes it so much simpler to build powerful automation and tooling around it than the wild west of randomly created 'json'-ish REST-ish APIs.


> Why does gRPC have to use such a non-standard term for this that only mathematicians have an intuitive understanding of? I have to explain the term every time I use it.

Who are you working with lol? Nobody I’ve worked with has struggled with this concept, and I’ve worked with a range of devs, including very junior and non-native-English speakers.

> Also, it doesn’t pass my “send a friend a cURL example” test for any web API.

Well yeah. It’s not really intended for that use-case?

> The reliance on HTTP/2 initially limited gRPC’s reach, as not all platforms and browsers fully supported it

Again, not the intended use-case. Where does this web-browsers-are-the-be-all-and-end-all-of-tech attitude come from? Not everything needs to be based around browser support. I do agree on http/3 support lacking though.

> lack of a standardized JSON mapping

Because JSON has an extremely anaemic set of types that either fail to encode the same semantics, or require all sorts of extra verbosity to encode. I have the opposite experience with protobuf: I know the schema, so I know what I expect to get valid data, I don’t need to rely on “look at the json to see if I got the field capitalisation right”.

> It has made gRPC less accessible for developers accustomed to JSON-based APIs

Because god forbid they ever had to learn anything new right? Nope, better for the rest of us to just constantly bend over backwards to support the darlings who “only know json” and apparently can’t learn anything else, ever.

> Only google would think not solving dependency management is the solution to dependency management

Extremely good point. Will definitely be looking at Buf the next time I touch GRPC things.

GRPC is a lower-overhead, binary rpc for server-to-server or client-server use cases that want better performance and the faster integration that a shared schema/IDL permits. Being able to drop in some proto files and automatically have a package with the methods available and not having to spend time wiring up URLs and writing types and parsing logic is amazing. Sorry it's not a good fit for serving your webpage; criticising it for not being good at web stuff is like blaming a tank for not winning street races.

GRPC isn’t without its issues and shortcomings- I’d like to see better enums and a stronger type system, and defs http/3 or raw quic transport.


I use protobuf to specify my protocol and then generate a swagger/OpenAPI spec, then use some swagger codegen to generate REST client libraries. For a proxy server I have to fill in some stub methods to parse the JSON and turn it into a gRPC call, but for the gRPC server there is some library that generates a REST service listener that just calls into the gRPC server code. It works fine. I had to annotate the proto file to say what REST path to use.
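
The library doing the REST-listener part is typically grpc-gateway; a rough Go sketch of the wiring (the Greeter service, the generated import path, and the ports are hypothetical, and the Register function is generated from the annotated proto):

    package main

    import (
        "context"
        "log"
        "net/http"

        "github.com/grpc-ecosystem/grpc-gateway/v2/runtime"
        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"

        pb "example.com/gen/greeter/v1" // hypothetical generated package
    )

    func main() {
        mux := runtime.NewServeMux()

        // JSON/REST requests (paths come from the google.api.http
        // annotations in the .proto) are proxied to the gRPC server.
        opts := []grpc.DialOption{grpc.WithTransportCredentials(insecure.NewCredentials())}
        if err := pb.RegisterGreeterHandlerFromEndpoint(context.Background(), mux, "localhost:50051", opts); err != nil {
            log.Fatal(err)
        }

        log.Fatal(http.ListenAndServe(":8080", mux))
    }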


>> Also, it doesn’t pass my “send a friend a cURL example” test for any web API.

> Well yeah. It’s not really intended for that use-case?

Until $WORKPLACE is invaded by Xooglers who want to gRPC all the things, regardless of whether or not there's any benefit over just using HTTPS. Internal service with dozens of users in a good week? Better use gRPC!


Oh yeah, no technology can design against being improperly deployed. I certainly don’t advocate for GRPC-ing-all-the-things! Suitable services only!


Hey, author here:

> Why does gRPC have to use such a non-standard term for this that only mathematicians have an intuitive understanding of? I have to explain the term every time I use it.

>> Who are you working with lol? Nobody I’ve worked with has struggled with this concept, and I’ve worked with a range of devs, including very junior and non-native-English speakers.

This is just a small complaint. It's super easy to explain what unary means but it's often infinitely easier to use a standard industry term and not explain anything.

>> Also, it doesn’t pass my “send a friend a cURL example” test for any web API.

> Well yeah. It’s not really intended for that use-case?

Yeah, I agree. Being easy to use isn't the intended use-case for gRPC.

>> The reliance on HTTP/2 initially limited gRPC’s reach, as not all platforms and browsers fully supported it

> Again, not the intended use-case. Where does this web-browsers-are-the-be-all-and-of-tech attitude come from? Not everything needs to be based around browser support. I do agree on http/3 support lacking though.

I did say browsers here but the "platform" I am thinking of right now is actually Unity, since I do work in the game industry. Unity doesn't have support for HTTP/2. It seems that I have different experiences than you, but I still think this point is valid. gRPC didn't need to be completely broken on HTTP/1.1.

>> lack of a standardized JSON mapping

> Because JSON has an extremely anaemic set of types that either fail to encode the same semantics, or require all sorts of extra verbosity to encode. I have the opposite experience with protobuf: I know the schema, so I know what I expect to get valid data, I don’t need to rely on “look at the json to see if I got the field capitalisation right”.

I agree that it's much easier to stick to protobuf once you're completely bought-in, but not every project is greenfield. Before a well-defined JSON mapping and tooling that adhered to it existed, it was very hard to transition from JSON to protobuf. Now it's a lot easier.

>> It has made gRPC less accessible for developers accustomed to JSON-based APIs

> Because god forbid they ever had to learn anything new right? Nope, better for the rest of us to just constantly bend over backwards to support the darlings who “only know json” and apparently can’t learn anything else, ever.

No comment. I think we just have different approaches to teaching.

>> Only google would think not solving dependency management is the solution to dependency management

> Extremely good point. Will definitely be looking at Buf the next time I touch GRPC things.

I'm glad to hear it! I've had nothing but excellent experiences with buf tooling and their employees.

> GRPC is a lower-overhead, binary rpc for server-to-server or client-server use cases that want better performance and faster integration that a shared schema/IDL permits. Being able to drop in some proto files and automatically have a package with the methods available and not having to spend time wiring up url’s and writing types and parsing logic is amazing. Sorry it’s not a good fit for serving your webpage, criticising it for not being good at web stuff is like blaming a tank for not winning street races.

Without looping in the frontend (aka web) it makes the contract-based philosophy of gRPC much less compelling. Because without that, you would have to have a completely different language for contracts between service-to-service (protobuf) than frontend to service (maybe OpenAPI). For the record: I very much prefer protobufs for the "contract source of truth" to OpenAPI. gRPC-Web exists because people wanted to make this work but they built their street racer with some tank parts.

> GRPC isn’t without its issues and shortcomings- I’d like to see better enums and a stronger type system, and defs http/3 or raw quic transport.

Totally agree!


> It's super easy to explain what unary means but it's often infinitely easier to use a standard industry term and not explain anything.

What's the standard term? While I agree that unary isn't widely known, I don't think I have ever heard of any other word used in its place.

> gRPC didn't need to be completely broken on HTTP/1.1.

It didn't need to per se (although you'd lose a lot of the reason for why it was created), but as gRPC was designed before HTTP/2 was finalized, it was still believed that everyone would want to start using HTTP/2. HTTP/1 support seemed unnecessary.

And as it was designed before HTTP/2 was finalized, it is not like it could have ridden on the coattails of libraries that have since figured out how to commingle HTTP/1 and HTTP/2. They had to write HTTP/2 from scratch in order to implement gRPC, so supporting HTTP/1 as well would have greatly ramped up the complexity.

Frankly, their assumption should have been right. It's a sorry state that they got it wrong.


> Hey, author here:

Hello! :)

>> Well yeah. It’s not really intended for that use-case?

> Yeah, I agree. Being easy to use isn't the intended use-case for gRPC.

I get the sentiment, for sure, I guess it's a case of tradeoffs? GRPC traded "ability to make super easy curl calls" for "better features and performance for the hot path". Whilst it's annoying that it's not easy, I don't feel it's super fair to notch up a "negative point" for this. I agree with the sentiment though - if you're trying to debug things from _first_ principles alone in GRPC-land, you're definitely going to have a bad time. Whether that's the right approach is something I feel is possibly pretty subjective.

> I did say browsers here but the "platform" I am thinking of right now is actually Unity, since I do work in the game industry. Unity doesn't have support for HTTP/2. It seems that I have different experiences than you…

Ahhhh totally fair. To be fair, I probably jumped the gun on this with my own, webby, biases, which in turn probably explains the differences in my/your next few paragraphs too and my general frustration with browsers/FE-devs, which shouldn't be catching everyone else in the collateral fire.

> No comment. I think we just have different approaches to teaching.

Nah, I think I was just in a bad mood haha. I've been burnt by working with endless numbers of stubbornly lazy FE devs the last few places I've worked, my tolerance for them is running out, and I didn't consider the use-case you mentioned of game dev/being beholden to the engine, which is a bit unfair. Under this framing, I feel like it's a difficult spot: the protocol wants to provide a certain experience and behaviour, and people like yourself want to use it, but are constrained by some pretty minor things that said protocol seems to refuse to support for no decent reason. I guess it's possibly an issue for any popular-yet-specialised thing: what happens when your specific-purpose tool finds significant popularity in areas that don't meet your minimum constraints? Ignore them? Compromise on your offering? Made all the worse by Google behaving esoterically at the best of times lol.

You mentioned that some GRPC frameworks have already moved to support http/3, do you happen to know which ones they are?


This is probably not exhaustive but I think these frameworks can support HTTP/3 today:

- The standard grpc library for C#, dotnet-grpc

- It may already be possible in Rust with Tonic using the Hyper HTTP transport

- It's possible in Go if you use ConnectRPC with quic-go

- This is untested but I believe many gRPC-Web implementations in the browser might "just work" with HTTP/3 as well, as long as the browser is informed of the support via the "ALT-SVC" header and the server supports it.


> Yeah, I agree. Being easy to use isn't the indented use-case for gRPC.

Sick burn. I like it, especially since most use of gRPC seems to be cargo-culting.


For me, the breaking point was when I saw the C++ bindings unironically recommend [1] that you use terrible anti-patterns such as "delete this". I find it unlikely that all these incredibly well paid Google engineers are unaware how people avoid these anti-patterns in idiomatic C++ (by e.g. std::shared_ptr). The only remaining sensible explanation is that Google internal gRPC C++ tooling must have utilities that abstract away this ugly underbelly, which us mere mortals are not privy to.

[1] https://grpc.io/docs/languages/cpp/callback/


I do generally agree the tooling sucks, but as mentioned, buf and the connectrpc ecosystem have made it much easier to get things going.


> Bad tooling

Lolwut. This is what was always said about ASN.1 and the reason that this wheel has to be reinvented periodically.


It can be true for both ASN.1 and gRPC? Moreover, definitions of "bad" can vary.


Before inventing a new serialization protocol it would be good to first study the field and pick an existing protocol that ticks all the right boxes, and if the tooling isn't very good then write new tooling -- you'd have to write new tooling for a new protocol anyways, but if you can find a good enough existing one then you don't also have to write a spec, thus saving you a lot of time.


In my latest project I actually needed an RPC library that was hardware accelerated, and I was surprised gRPC doesn't do RDMA, for example. Why is that?



