API design is stuck in the past

marta_morena_28 · on Nov 14, 2020

> Two decades ago, it was widely argued that dynamic programming languages were more productive because you didn't have to spend time dealing with type signatures. The only reason, then, to use a statically typed language, was for better performance.

This boggles the mind. I am using typed languages since I can think. I never once recall an instant where I was saying "Uh uh, this type signature is driving me craaazy". Like seriously, I don't even know what articles like this are talking about. Types are not your enemy, they surface issues early. Yeah you can whip something up in Python and JS, but ultimately you DO have to deal with types, except now you don't have a compiler doing this job anymore, you have to do it yourself... Somehow.

The only thing typed languages need is something like `dynamic` from C#, which automatically boilerplates untyped access with reflection, without cluttering your code. I.e. duck typing is one of the things some languages like Java need to get better at. But the situations in which I yearn for this are far and few between.

You don't think about types, unless you are new to typed languages... It's really that simple. I never have to think about types. Perhaps its subconcious, but its definitely not slowing me down, its making things faster through robust refactoring, auto-complete and welll doh: TYPE SAFETY!

dpc_pw · on Nov 14, 2020

Languages like Java (which was THE statically typed language around that time) were extremely verbose and inexpressive. It's hard to write anything in Java without metric tons of boilerplate and repeating the type of every value over and over and over again.

On top of it, with OOP languages it's very easy to box the code in deep layers of ridiculous inheritance taxonomies, making it pretty much impossible to refactor anything after business requirements changed. Barely any escape hatches and absolutely no flexibility.

And type-safety is kind of pointless when everything is of a given type "OR NULL".

I was always proponent of typing but after working with Java, I can totally see why so many people back in the day considered static typing as not worth it.

nicoburns · on Nov 14, 2020

Statically typed languages without Sum Types (aka tagged unions aka enums) which includes Java, C# and C++ amongst others drive me crazy: they have no ergonomic way to express "or" types. This is a massive expressive hole which (along with lacking type inference) I believe is responsible for a lot of the hate towards statically typed languages.

bigyikes · on Nov 14, 2020

This is one of the joys of TypeScript.

Some of my coworkers complain about being required to use TS instead of JS, and I just wonder why in the world you would want to use JS in a massive codebase.

IshKebab · on Nov 14, 2020

It is, although the syntax for proper sum types (it calls them "discriminated unions") is really verbose in Typescript. I wish they'd make it more terse so people didn't use untagged sum types and hacks like `typeof` all the time.

lolinder · on Nov 14, 2020

Sealed classes in Kotlin largely solve that problem for me, and Java is getting those. It's not quite the same, but most of the time I find that if I'm trying to do Int|String, it's primitive obsession and there is actually a better sealed type hierarchy I'm missing.

flohofwoe · on Nov 14, 2020

If you care about the exact underlying memory layout such high-level types are way too blackbox-y. I doubt that historically this specific feature was responsible for any "static typing hate" (because languages with such high-level type systems were quite obscure 10 or 20 years ago). I have the impression that this hate was specifically a web-dev thing because many Javascript developers never experienced what it's like to work with a statically typed language until Dart and Typescript showed up (and then it suddenly was the best thing since sliced bread).

jmfldn · on Nov 14, 2020

> Statically typed languages without Sum Types (aka tagged unions aka enums) which includes Java, C# and C++ amongst others drive me crazy: they have no ergonomic way to express "or" types. This is a massive expressive hole which (along with lacking type inference) I believe is responsible for a lot of the hate towards statically typed languages.

Union types coming in Scala 3

valenterry · on Nov 14, 2020

Yes, totally agree. It's the lack of statical type systems like this one that makes people hate them - for good reasons.

NOGDP · on Nov 14, 2020

These languages have far better type systems than languages like python or javascript, that's not a reason to hate them.

valenterry · on Nov 14, 2020

We are talking about user experience here. In that regard, comparing python/javascript with Java/C++ is comparing apples with oranges.

NOGDP · on Nov 14, 2020

We are talking about Python and JavaScript from the top of this comment chain, and the 'user experience' of writing Java/C# is closer to Python and JavaScript than to, say, Haskell.

nicoburns · on Nov 14, 2020

Yes, but I find that the user experience of writing Rust or Swift is closer to that of JS than C# and Java.

NOGDP · on Nov 15, 2020

Why? Rust's type system is basically a more sophisticated version of Java's, JS is in the opposite direction - a much simpler dynamic type system. Rust's lifetimes and borrow checker is additional complexity that JS doesn't have. Rust has longer compilation times than JS, longer than Java. Etc.

nicoburns · on Nov 15, 2020

>Why? Rust's type system is basically a more sophisticated version of Java's, JS is in the opposite direction - a much simpler dynamic type system.

I would argue that a sophisticated type system is closer to dynamic typing than a simple type system. A type system is like guard rails that prevent you from doing certain things. A sophisticated type system gives you more freedom and possibilities than a simple type system, and hence is closer to a dynamic type system without the guard rails at all.

NOGDP · on Nov 15, 2020

Java gives you a decent way to build guard rails, Rust gives you a somewhat better way, JS doesn't give you any way at all.

A more sophisticated type system gives you more elegant ways of expressing constraints, with increased language complexity. Java falls in the middle with a fairly simple language and type system. Any code which you do not know how to structure within Java's type system can be written with some Objects/Maps/Collections and a bit of runtime logic - basically giving you what you'd do in JavaScript, though Java's type system is sufficiently powerful for 99% of real-world use cases.

The main issue I see Python and JS programmers face when coming to a statically typed language like Java is the additional complexity of a type system. Saying that a more complex type system would somehow be more familiar is just backwards.

valenterry · on Nov 15, 2020

> Any code which you do not know how to structure within Java's type system can be written with some Objects/Maps/Collections and a bit of runtime logic

Yeah, but that means jumping through hoops because of the lack of the typesystem. And that's what many people don't like, hence the whole thread. For you that might be fine, but please understand that there are other people out that who are not okay with it so easily.

> Java's type system is sufficiently powerful for 99% of real-world use cases

Rather the opposite. Every big project that uses reflection/introspection or annotations or some kind of code generation tooling shows that the typesystem is not sufficient. Yeah, there are some cases where the above techniques were used and could have been avoided (while keeping typesafety), but often they are just required.

And then Java does not even have proper sumtypes or union types (enums only work when the structure is identical and I mean... we could count some strange workarounds with static classes and private constructors that pretty much no one uses due to horrible ergonomics). And these literally appear everywhere.

I diagnose that you are suffering from the famous http://wiki.c2.com/?BlubParadox

NOGDP · on Nov 15, 2020

> Any code which you do not know how to structure within Java's type system can be written with some Objects/Maps/Collections and a bit of runtime logic Yeah, but that means jumping through hoops because of the lack of the typesystem. And that's what many people don't like, hence the whole thread. For you that might be fine, but please understand that there are other people out that who are not okay with it so easily.

Not jumping through hoops... the point is that you can write untyped code in Java similar to JS with similar complexity. If you really think JS is somehow better in this regard then writing horrible poorly-typed Java is not very different.

> Rather the opposite.

The opposite as in Java is arguably the most relied-upon language for enterprise-grade backend code, because it lacks 99% of features people would want? Okay.

> Every big project that uses reflection/introspection or annotations or some kind of code generation tooling shows that the typesystem is not sufficient.

Annotations and reflection are a feature of Java, they are not external to the language. Annotation and code generation are separate features from the type system - Rust's code generation and annotations are very commonly used. Reflection is equivalent to runtime typechecking that is common in JS. How can you say Java is worse then JS in this regard when the poor parts you point out are basically what JS does?

> And then Java does not even have proper sumtypes or union types (enums only work when the structure is identical and I mean... we could count some strange workarounds with static classes and private constructors that pretty much no one uses due to horrible ergonomics). And these literally appear everywhere.

First of all JS and Python do not have these either, so saying that Java is somehow worse in this regard is ridiculous. Furthermore the usefulness of sum types is fairly limited - what problem are you trying to solve with sum types in Java? Implementing an Either<A,B> is trivial in Java.

> I diagnose that you are suffering from the famous http://wiki.c2.com/?BlubParadox

I think that applies to you a lot more. You're criticising Java for giving you a whole bunch of features which don't exist in Python or JS, while also saying the language sucks compared to Python/JS because it doesn't have features of Haskell. The fact that you think JS is somehow closer to Rust than Java makes me think you have very limited experience with these languages.

valenterry · on Nov 16, 2020

> Not jumping through hoops... the point is that you can write untyped code in Java

Sure, but that already means you are jumping through hoops.

> > > Java's type system is sufficiently powerful for 99% of real-world use cases

> > Every big project that uses reflection/introspection or annotations or some kind of code generation tooling shows that the typesystem is not sufficient.

> Annotations and reflection are a feature of Java, they are not external to the language. Annotation and code generation are separate features from the type system - Rust's code generation and annotations are very commonly used. Reflection is equivalent to runtime typechecking that is common in JS. How can you say Java is worse then JS in this regard when the poor parts you point out are basically what JS does?

Well, maybe I misunderstood you. And with "sufficiently powerful" you just meant "someone can kinda use it to build something". Well then, yes. Saying it just doesn't make much sense to me in a discussion about ergonomics where people complain about type system limitations.

> Implementing an Either<A,B> is trivial in Java.

Okay, let me copy&paste how this can be defined in F#:

    type Result<'TSuccess,'TFailure> = 
        | Success of 'TSuccess
        | Failure of 'TFailure

or maybe a language closer to Java, here it is in Scala3:

    enum Either[A, B] {
      case Left[A](a: A)  extends Either[A, Nothing]
      case Right[B](b: B) extends Either[Nothing, B]
    }

I'm curious to see the "trivial" implementation in Java that equals the ones from F# and Scala. Mind that both solutions I gave allow to add a "fold(left -> handleLeft(...), right -> handleRight(...))" function which allows to manipulate the content, depending on what it is _without using any casts or reflection_. This is possible in Java, but I don't know any "trivial" solution.

NOGDP · on Nov 16, 2020

> Sure, but that already means you are jumping through hoops.

Map<String,Object>, Object, a few other things you may need are standard Java, so not sure how this is 'jumping through hoops'. It's not necessarily more complicated, just not idiomatic java - the point is you CAN write shitty JS-style code if you want, how is that an argument for why JS is somehow better than Java?

> Well, maybe I misunderstood you. And with "sufficiently powerful" you just meant "someone can kinda use it to build something". Well then, yes.

Are you not aware that many prominent tech companies have a significant Java stack? Google, Amazon, Uber, Airbnb, Netflix, etc? Are you not aware of major open source Java projects such as kafka, elasticsearch, hadoop, android sdk, etc? What point are you even trying to make?

> Saying it just doesn't make much sense to me in a discussion about ergonomics where people complain about type system limitations.

What doesn't make sense is saying that Java's type system makes it a worse language than JS or Python, or that JS or Python are closer to Rust/Haskell.

> I'm curious to see the "trivial" implementation in Java that equals the ones from F# and Scala. Mind that both solutions I gave allow to add a "fold(left -> handleLeft(...), right -> handleRight(...))"

Here you go:

  class Either<L,R>
  {
      public static <L,R> Either<L,R> left(L value) {
          return new Either<>(Optional.of(value), Optional.empty());
      }

      public static <L,R> Either<L,R> right(R value) {
          return new Either<>(Optional.empty(), Optional.of(value));
      }

      private final Optional<L> left;
      private final Optional<R> right;

      private Either(Optional<L> l, Optional<R> r) {
        left=l;
        right=r;
      }
  }

Yes, it's longer and slightly more complicated, mainly because Java doesn't have pattern matching, and yes you can add typesafe fold and map functions to it without reflection. That being said, you gave me examples in languages with more sophisticated type systems than Java - these say absolutely nothing about why Java is worse than Python or JS.

valenterry · on Nov 17, 2020

> Map<String,Object>, Object, a few other things you may need are standard Java, so not sure how this is 'jumping through hoops'

When you put various things into this map and then later get them out and want to work with them, you will have to cast them to be able to do anything useful with them.

> the point is you CAN write shitty JS-style code if you want, how is that an argument for why JS is somehow better than Java

For the sake of our discussion: I have never said that JS were somehow better than Java. I much prefer statical type systems and would always pick Java over JS for a personal non-browser projects. But that's not the point of this discussion, so I'm playing "devil's advocate" here. It's important to understand and accept the shortcomings of statical type-systems - that's what I try to explain here.

> What doesn't make sense is saying that Java's type system makes it a worse language than JS or Python

You need to re-read what I (and the others in this subtread) have written. It is completely valid to criticize one part of language X compared to language Y without implying that this language X is worse than another language Y overall.

> [Java Either implementation]

> Yes, it's longer and slightly more complicated

And not only that, it is also not equivalent to the F# / Java examples. Or if it tries to be equivalent, it is buggy.

E.g.:

    Either.left(null)

Now I have an Either that is neither left nor right. Compared to the Scala example (because Scala also has to deal with the existence of null):

    Left(null)

This will create an instance of the type Left which contains a null-value. As I said, if I add a `.fold` method, then it will fold over the null. E.g.:

    Left(null).fold(left => "Left value is " + left, right => "Right value is" + right)

This would return the String "Left value is null". You can't do this with your example implementation in Java, because the information is lost.

It is _not_ trivial to do that in Java, even when relying on already similar functionality like the built-in Optional type.

NOGDP · on Nov 17, 2020

> When you put various things into this map and then later get them out and want to work with them, you will have to cast them to be able to do anything useful with them.

Considering this is entire hypothetical is a edge case, that's a minor inconvenience.

> But that's not the point of this discussion, so I'm playing "devil's advocate" here. It's important to understand and accept the shortcomings of statical type-systems - that's what I try to explain here.

That is the point of the discussion, the original claim I was objecting to was:

'This is a massive expressive hole which (along with lacking type inference) I believe is responsible for a lot of the hate towards statically typed languages.'

You're pointing out weaknesses in a subset of statically typed languages, and these are only weaknesses when compared to better type systems - not when compared to dynamically typed languages. I never claimed that Java had a perfect type system - I prefer Haskell and Rust.

> You need to re-read what I (and the others in this subtread) have written. It is completely valid to criticize one part of language X compared to language Y without implying that this language X is worse than another language Y overall.

It's not valid when you're using Rust or Haskell to show weaknesses in Java relative to JS. The original context was Java/C#/C++ vs Python/JS.

> Now I have an Either that is neither left nor right.

You're right. Here's a simple example without this problem:

  class Either<L,R>
    {
        public static <L,R> Either<L,R> left(L value) {
            return new Either<>(value, null, true);
        }

        public static <L,R> Either<L,R> right(R value) {
            return new Either<>(null, value, false);
        }

        private final L left;
        private final R right;
        private final boolean isLeft;

        private Either(L l, R r, boolean isLeft) {
          left = l;
          right = r;
          isLeft = isLeft;
        }
    }

> It is _not_ trivial to do that in Java, even when relying on already similar functionality like the built-in Optional type.

It is trivial, it's just more awkward and lengthy but not complex at all - also no Optional. Plus there are stable libraries providing types like Either<A,B>, and other functional language features. Anyway, I'm not here to defend Java type system against Haskell, my point is that Java type system is a huge feature when compared to JS or Python.

valenterry · on Nov 17, 2020

> Considering this is entire hypothetical is a edge case, that's a minor inconvenience.

I believe this is not an edgecase. I have to deal with that almost everyday and I'm working with a language that has a much more advanced typesystem than Java. But I guess there is no hard data for that, so everyone can believe what they what. :)

> You're right. Here's a simple example without this problem:

If it's so trivial, then why do you even have to fix something in your first approach. Also, you second approach still has flaws and is not equivalent. Maybe you want to figure it out yourself this time? :)

Anyways, I guess we are talking different points. Have a nice day!

valenterry · on Nov 15, 2020

First of all, of course Rust hast some additional complexity because it is close to bare metal. But if you think this complexity away (to make it comparable to e.g. javascript), here are some reasons:

1) Better type-inference. In Java this has improved but is still much more clunky and boilerplatey. Good type-inference is important to not annoy the user.

2) Traits / type-classes. They enable a way of programming that comes much closer to duck-typing and avoid wrapping your objects in wrapper-classes to support interfaces like you are forced to do it in Java.

3) Better and less noisy error handling (looking at you Java, Go, C++ and most other languages)

NOGDP · on Nov 15, 2020

You can't 'think away' the additional complexity of lifetimes and the borrow checker though. It's something you have to understand and keep in mind.

> Better type-inference. In Java this has improved but is still much more clunky and boilerplatey. Good type-inference is important to not annoy the user.

In my opinion, Java without type inference is fine - it's very minor issue, and there is a fairly limited scope of code that would actually benefit from type inference in terms of quality/readability. If you use a decent editor most of the redundant typing is auto-completed anyway.

> Traits / type-classes. They enable a way of programming that comes much closer to duck-typing and avoid wrapping your objects in wrapper-classes to support interfaces like you are forced to do it in Java.

Eh, Rust traits are better than Java's interfaces, but you can implement multiple interfaces for your own objects in Java without any wrappers. The issue is extending external objects to support new interfaces. Plus, the point is to have correct code defined and checked at interface/trait boundaries, something JS doesn't do at all.

> Better and less noisy error handling (looking at you Java, Go, C++ and most other languages)

The error messages for Rust can be much more complex than Java, and are probably more complex on average, simply because it's a more complex language and type system.

I would say the benefits of Java's type system far outweigh the imperfections and tiny costs when compared to a language like JS.

valenterry · on Nov 15, 2020

> You can't 'think away' the additional complexity of lifetimes and the borrow checker though

Of course you can't. But the problem of lifetimes does not go away, not matter if you use statical or dynamic typing. However, in javascript this problem does not exist, so obviously that can't be compared to Rust. If you use Rust, then because you _need_ this for performance.

> In my opinion, Java without type inference is fine

Fair enough, but most people see that very different, hence the unhappiness.

> Eh, Rust traits are better than Java's interfaces, but you can implement multiple interfaces for your own objects in Java without any wrappers. The issue is extending external objects to support new interfaces.

That's exactly what I said or at least meant. :)

> The error messages for Rust can be much more complex than Java

No no, not the error messages that the rust compiler gives you. I'm talking about error handling that the developer does.

> I would say the benefits of Java's type system far outweigh the imperfections and tiny costs when compared to a language like JS.

I agree, but just because the benefits outweight the problems, that doesn't mean people will be frustrated by these problematic parts. And let's not call it imperfections. Java is _so_ far away from perfection, that just gives your post a sarcastic touch.

NOGDP · on Nov 15, 2020

> However, in javascript this problem does not exist, so obviously that can't be compared to Rust.

Static typing and a whole bunch of other things don't exist in JS as well. The point of a comparison is to highlight the differences and similarities. You were the one who said Rust is more similar to JS than Java, you can't just reduce the language to 'type inference and traits' - things JS doesn't have at all and say it's somehow similar to JS.

> Fair enough, but most people see that very different, hence the unhappiness.

You have data supporting this 'most people see it...' argument? Or you just made it up on the spot?

> No no, not the error messages that the rust compiler gives you. I'm talking about error handling that the developer does.

Error handling in Java is very straight forward and it's far more similar to JS than Rust is.

> And let's not call it imperfections. Java is _so_ far away from perfection, that just gives your post a sarcastic touch.

Java is a great language. It's not overly complex, it's fast, it has great tooling and IDE support, it has one of the largest library ecosystems. It has a huge developer community, many high profile projects, many high profile companies use it. It's easy to find decent Java developers for your project. It has a decent type system - far better than JS or python. From a pragmatic point of view Java is one of the best languages in existence.

valenterry · on Nov 16, 2020

Okay, I think it was not a good idea to compare these languages from the beginning. I don't think this discussions leads anywhere.

NOGDP · on Nov 14, 2020

Modern versions of these languages have these features to a varying degree.

junon · on Nov 14, 2020

std::variant or std::any don't count? Overloading doesn't count? Templates don't count?

Seems like you should play with C++ a bit more.

valenterry · on Nov 14, 2020

Not a C++ developer, but from what I read, it says: "A variant is not permitted to hold references, arrays, or the type void". These are quite some limitations and don't really give a "smooth" experience.

junon · on Nov 15, 2020

None of those types make sense in variants. This is obvious to C++ programmers.

With all due respect, if you're not qualified to make assertions about something, perhaps refrain from labeling it as "quite the limitation".

valenterry · on Nov 15, 2020

Can you explain why it makes no sense for it to hold an array or void? I'm really curious and will take back my claim.

junon · on Nov 15, 2020

You cannot trivially compare or move an array. Note that `std::array` is still allowed in `std::variant` - just not C-style arrays.

As for references, you cannot re-bind references. Rationale here: https://stackoverflow.com/a/27037871/510036

As for void, apparently the reasons I had in my head are not the reasons in real life. I thought it might be because of a destructible requirement on the type, but it turns out there really isn't a good reason why they disallowed it, and that a future standard might allow it.

In any event, there are a multitude of variant implementations that allow all sorts of things depending on the behavior you want. Nothing is forcing you to use the standard library.

valenterry · on Nov 16, 2020

Thank you for the explanation!

I just wonder why it is hard to make a variant-type that works with everything. Well, if the language prevents e.g. reference rebinding, there can't be done much.

But not being able to put _anything_ into a variant severely limits the way it can be used for abstraction. Especially for library authors, because they might not no what their users will pass them. So when they write a generic method, that uses variants under the hood, they would have to "pass down" the restrictions to the user and tell them not to pass e.g. void. Same for the interaction of two libraries.

Or am I misunderstanding the constraints here?

junon · on Nov 16, 2020

> I just wonder why it is hard to make a variant-type that works with everything.

Because standard C++ has to work for the general case. It's specifically designed so that more pointed or specific implementations that have other concerns (e.g. supporting C-style arrays or void) can do so, accepting the runtime penalties if desired.

> reference rebinding, there can't be done much.

References are syntactic sugar over pointers at worst, and a means of optimization at best. C++ is a pass-by-value language first and foremost, and goes to great lengths to keep things as optimizable as possible when it comes to standardization.

Again, variant supports pointers just fine. It also supports smart-pointers just fine. There's nothing preventing you from using those.

Remember that C++ has to work across all architectures, platforms, etc. Not everything handles references the exact same. Compilers are afforded many liberties when it comes to them in order to optimize for the target machine.

> But not being able to put _anything_ into a variant severely limits the way it can be used for abstraction.

Aside from `void`, I disagree. Like I said before, you can implement your own variant quite easily if you want those things. There are decent reasons (except for `void`) not to include them in the standard.

> Especially for library authors, because they might not no what their users will pass them.

I'm not so sure I understand what you mean here. Templates tell library authors /exactly/ what will be passed to them.

> So when they write a generic method, that uses variants under the hood, they would have to "pass down" the restrictions to the user and tell them not to pass e.g. void.

They don't have to tell the user anything. The compiler will inform them void is not allowed if a type cannot be compiled.

> Or am I misunderstanding the constraints here?

Yes. Most of what std::variant does in terms of type checking happens at compile time. Unless a program has been modified after compilation (which should never be the case), there's no possible way for the "wrong type" to be passed to a variant at runtime, because the assortment of possible types have been checked at compile time.

---

EDIT: I just realized why `void` may not be included, though I admit it's speculation.

`void` is not allowed as a function argument type; it is not equivalent to e.g. `decltype(nullptr)` and simply is the lack of type.

Therefore, there is no valid specialization of `operator=(const T&)` that would accept a "void type" because there is no way to implicitly invoke `operator=(void)` (you'd have to call, literally, `some_variant_object.operator=()`, which is very un-C++).

The alternative would be to have a method akin to `.set_void()`, and it could only be enabled if `void` was one of the types passed to the template parameter pack - and, if it is the only type passed to the parameter pack, all other `operator=()` overloads would have to be disabled.

This is an incredibly confusing API specification that I can understand if never included in the standard.

Note that, in this case, there'd be a difference between "no value" (null) and a "void value" (not-null), which is overly confusing and, again, very un-C++ (or un-C for that matter).

If this is the rationale, it makes a lot of sense and I agree with it. If I need a variant that supports `void`, I'd probably write my own anyway because there's probably a domain-specific use case.

edflsafoiewq · on Nov 14, 2020

std::variant is basically a parody of everything wrong with modern C++.

junon · on Nov 15, 2020

Please elaborate because I strongly disagree with you.

coldtea · on Nov 14, 2020

>This boggles the mind. I am using typed languages since I can think. I never once recall an instant where I was saying "Uh uh, this type signature is driving me craaazy".

You might not, but this was a common sentiment (not saying it is necessarily a valid one, mind you, but it was common). Were you programming 2 decades ago and/or paying attention to the average sentimeντ expressed in blogs/etc re types and dynamic languages (and the general tone up to around 2012 or so even in HN)?

Another common sentiment was that "who needs types when you have TDD".

taneq · on Nov 14, 2020

The people who have problems with types and think removing them makes programming easier are the same people who have problems with syntax and think that replacing text with some kind of graphical representation makes programming "easy for non-programmers".

mjcohen · on Nov 14, 2020

Without types, TDD can become TTD.

randomdata · on Nov 14, 2020

> I never once recall an instant where I was saying "Uh uh, this type signature is driving me craaazy".

Is that sentiment based on modern languages, though?

While modern in its place in history, but adhering to older principles, I frequently hear exactly that from people evaluating Go. Languages with more complex type systems bring tools to help alleviate those concerns. Not all of those concepts were widely available looking back two decades ago. Java, for example, which was probably the most popular typed language of that time did not provide even generics until 2004. Being able to write a function once and use it for many (dynamic) types was no doubt seen as a big improvement for many use cases.

Type systems are back in fashion now largely because they are much more usable now, especially outside of the academic languages.

mdoms · on Nov 14, 2020

> This boggles the mind. I am using typed languages since I can think. I never once recall an instant where I was saying "Uh uh, this type signature is driving me craaazy". Like seriously, I don't even know what articles like this are talking about.

This was OVERWHELMINGLY the sentiment on hacker news a decade ago. Strongly statically typed languages were NOT WELCOME on this website.

danenania · on Nov 14, 2020

A lot of that can probably be attributed to major advances in type system ergonomics. A verbose, clunky type system with poor inference can easily be worse than none at all. Ten years ago, there just weren't that many popular, practical languages with really good type systems. Now there are quite a few.

pjmlp · on Nov 14, 2020

I share the same sentiment, the only dynamic language I put up with is JavaScript, and PHP when dealing with my own site.

Tcl, Smalltalk, Lisp, Prolog, while great to program in the small, have taught me that I really want types when working in a team.

Python, Ruby and Perl, I really don't see the use beyond learning to program or grown up shell scripts, given their lack of attention to performance.

Daishiman · on Nov 14, 2020

The vast majority of code ever written is not performance-sensitive, but is very sensitive to getting produced rapidly.

If you need to quickly perform some statistical analysis, you'd be hard pressed to be more productive in anything else over Python+NumPy or R.

pjmlp · on Nov 14, 2020

Statistical analysis is a domain I am more than happy not to care about, it was already enough what I had to bare with during my engineering degree.

Still, given that Python and R are just glue languages for C, C++ and Fortran libraries, I rather use the source directly, or bindings to typed languages.

Modern C++, .NET, Java or ML based languages are just as effective,.

And here is a fun fact, I have spent 4 years working for life sciences companies, where several researchers I got to know, would do statistical analysis in Excel + VBA, eventually using VB.NET as well for more complicated stuff.

kumarvvr · on Nov 14, 2020

Where dynamic typing helps a lot is for creating frameworks, like Django, RoR, etc.

Because the framework is working on a level above the application, it can easily deal with objects without worrying about what is inside them.

However, people got caught up in this no-type nonsense and took it to all corners of every app development.

When you are creating, say web apps, you may not create types in dynamic languages like python, but you sure as hell will use known variables in those types.

There are a few edge cases, where dynamic types allow you to build logic on user-defined sets of data, but those cases are few and far between. Even those can be solved using generic data containers or custom data protocols such as XML.

valenterry · on Nov 14, 2020

> Because the framework is working on a level above the application, it can easily deal with objects without worrying about what is inside them.

Even for that there is no need for dynamic typing anymore. This problem has been solves with type parameters (aka generics) and type-classes.

Daishiman · on Nov 14, 2020

Not true.

In Python and Ruby it's an extremely common pattern to return a dictionary with a half-dozen entries at most that will be consumed at a single location.

Defining an entire class for this sort of extremely common use case is for the most part a waste of time.

These languages allow for the formalization of those types by creating classes out of them, but looking at what 50% of my functions do in web dev code, they're returning tuples, small dictionaries, or standard library objects.

valenterry · on Nov 15, 2020

What makes you think that statically typed languages don't have (type-safe) tuples or dictionaries?

Maybe you can give a minimal code example of what you mean and I'll show you how I would solve that in a statically typed way. :)

Daishiman · on Nov 17, 2020

Which mainstream language supports anonymous structural types? C# is the only one that comes to mind and it's not a standard pattern AFAIK.

valenterry · on Nov 17, 2020

Wait a moment! Since when are we talking about only mainstream languages? And how is that even defined?

That's not fair play

Daishiman · on Nov 17, 2020

Language features are a secondary consideration when compared to ecosystem and library availability as far as getting actual stuff done.

I never claimed those languages don't exist, but a language is a full package, and right now I'm not seeing even ascending languages as supporting trivial structural return types as an idiom. Granted, this is mostly a personal pet peeve given my observations and usage of code.

valenterry · on Nov 18, 2020

> I never claimed those languages don't exist

Yes you did. Let me quote:

Me: > Even for that there is no need for dynamic typing anymore. This problem has been solves with type parameters (aka generics) and type-classes.

This is obviously a general statement. I means "there can be a programming language where there is no need for dynamic typing to solve this kind of problem". And then I continue, that this problem has been solved. That means there is at least one such a language already existing which solves this problem with certain techniques.

Then you: > Not true.

And once I offer you a concrete implementation as an example, you suddenly change the topic to "but... no mainstream language". And if I would present you a language that could be considered as mainstream, I'm sure you would find another restriction such as "but this language does not have enough... libraries".

There certainly is no true Scotsman for you.

magicalhippo · on Nov 14, 2020

> Types are not your enemy, they surface issues early.

Not only that, but they document the code.

When working with a new framework or library in Python or JavaScript I never know what I can do, I have to look at the documentation constantly.

Not seldom I'm still left scratching my head or doing stuff like "print(dir(result))" to figure out what I can do with whatever that function returned.

With a static typed language I can see what type the function expects and what it returns. If I don't know a type I can discover what it can do in a few clicks in my IDE.

mbesto · on Nov 14, 2020

> The only reason, then, to use a statically typed language, was for better performance.

Uhh, someone please correct me if I'm wrong, but aren't statically typed languages about reliability (e.g. testing, mutability)?

EDIT: in addition to performance, not solely.

marta_morena_28 · on Nov 14, 2020

Statically typed languages only eliminate SOME things you'd otherwise have to test. In the end, you can eliminate almost anything besides logical errors, however, those are unfortunately a pretty big portion of bugs :D.

So while I would always use statically typed languages for anything that needs to be reliable, I do not see how this is in any way a necessity. You CAN write reliable programs without type safety, you just have to test things you normally wouldn't have to test (i.e. the lack of type safety introduces a whole bunch of null-pointer style scenarios where you get something your code totally didn't expect but has to deal with).

As for performance. Statically typed languages are usually faster, mostly because we do not have the technology yet to make dynamically typed ones as fast (in the general case). Not because there is something inherently different about them.

However, I imagine the technology to make them on par with statically typed languages will take another few decades. Mainly because untyped languages need sophisticated transformations to be fast. That is the job the human normally does for the compiler in typed languages. Things just fit together and play nicely. With dynamic languages, your get one big spaghetti soup with no structure and now the compiler has to figure out the "intended" types and compile for different estimated function signatures, etc. all while honoring the JIT-style performance considerations (fast startup, low overhead during runtime). This is a gargantuan task that probably will require advanced machine learning before it really takes off.

mbesto · on Nov 14, 2020

> I do not see how this is in any way a necessity.

Right, but that's not my point. My point is that using a static type language isn't JUST about performance. If I define foo as a string and call it as an integer in my program, my compiler is going to catch it, whereas in a dynamically type language I may not discover the bug until it hits production.

hedora · on Nov 14, 2020

Statically typed languages typically have better performance.

In C, i++ is one or two machine instuctions. In javascript, we don’t know if i is an int, or something that overrode ++ to download wikipedia. So, it ends up being a function call in naive JavaScript. Fast forward a decade, and dozens of PhD theses mean that ++ is usually a machine instruction in JavaScript, but it is not guaranteed.

zdragnar · on Nov 14, 2020

Two decades ago, dynamic languages were more typically scripted / interpreted, and statically typed languages were more typically compiled. Scripting does allow for quicker iteration (especially if compared to a language with a slow compiler) and compilation usually does produce faster code (at least pre-dating sophisticated JIT compilers).

d0mine · on Nov 14, 2020

A dynamic language is not just a statically-typed language where all type labels are removed -- it is the wrong mindset -- don't try to write Java programs using Python syntax.

On type safety: Python is a strongly typed language unlike e.g. C:

  >>> "%d+%d=%d" % (2, '2', 4)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: %d format: a number is required, not str

vs:

  #include <stdio.h>
  
  int main(void)
  {
    return printf("%d+%d=%d\n", 2, "2", 4) < 0; 
  }

tomc1985 · on Nov 14, 2020

Funny, that's one of the things that drives me nuts about Python. It knows each types, it knows function to convert to each type, but... it doesn't do the conversions! So frustrating! Why do I have to care about types in these dynamically typed languages??

Also, is sprintf-style string formatting the best example here? I think that feature is type strict in a lot of languages, after all you are declaring the types you want in the formatting string. I imagine most implementations of % in dynamic languages pass to sprintf internally?

imtringued · on Nov 14, 2020

>It knows each types, it knows function to convert to each type, but... it doesn't do the conversions! So frustrating!

It surprises me that you are incapable of thinking one step ahead. Implicit type conversions undermine the type system and make your language completely unpredictable. Pretty much everyone regrets this feature in C++ and it is still a huge source of bugs because you have to opt out of it.

You want things to fail with a loud bang instead of continuing and destroying things along the way.

tomc1985 · on Nov 14, 2020

> It surprises me that you are incapable of thinking one step ahead.

Seriously, what is this? You don't know me. Lay the fuck off.

My thoughts on the matter are either that a language should be strongly typed, with type declarations and enforcement at the compiler level, or dynamically typed in such a way that I shouldn't have to worry about types except for specific circumstances. Dynamically typed languages should know their coercion capabilities and perform them losslessly when needed. In reality what ends up happening is that some type coercions are automatic and some aren't, and if you're a polyglot then this is yet another one of those stupid arbitrary details you have to memorize for each language you work with. (I really would like less of those, there are too many different languages for doing the same thing)

One of the benefits of dynamic languages is that they handle type stuff for you. Which I interpret as, "cool, I don't have to worry about types!"

Yeah I know python is all about being obnoxiously explicit, and that is one of the things that I really do not like about that language. Now that I think about it maybe Python should have been strongly typed, its "no implicit behavior" opinion works much better under such a regime

marmaduke · on Nov 14, 2020

Some of those things in Python reflect its history as a better language between Bash and C, and the gotchas of those languages became verboten in Python.

Elsewhere in eg NumPy conversions between number types are automatic as long as they don’t lose information.

stevesimmons · on Nov 14, 2020

Python is biased towards "explicit is better than implicit". It would rather fail with a TypeError than silently coerce types and risk masking logic/type errors.

> Why do I have to care about types in these dynamically typed languages?

Because Python is strongly typed, and types determined behaviour.

visarga · on Nov 14, 2020

> I never once recall an instant where I was saying "Uh uh, this type signature is driving me craaazy"

For example, this is ugly in Python:

def handler(on_error: Callable[[int, Exception], None]):

emptysea · on Nov 14, 2020

Alternatively a Protocol can be used instead of Callable

  from typing import Protocol

  class ErrorHandler(Protocol):
      def __call__(self, p0: int, p1: Exception) -> None: ...

  def handler(on_error: ErrorHandler): ...

valenterry · on Nov 14, 2020

It doesn't matter if you have the types ascribed explicitly or not. In the end, the developer needs to know the types in both cases anyways. Otherwise, if they don't know the types, they will pass in a callable that expects the wrong types (for example they mix up [int, Exception] to [Exception, int]) and now you just have a runtime error.

If anything, having type signatures like that are a good thing! If you think they are too complicated/ugly or driving you crazy, then you have an incentive to improve them. In many languages, callbacks are now considered bad style and that's good! Hiding the type signature does not make the problem go away, you just move it into the future where it will bite you even more.

Too · on Nov 14, 2020

What's the alternative?

No type signature? Or documentation in another file somewhere else guaranteed to be out of sync, difficult to find and not enforced by the compiler? No thanks to either of them.

mixmastamyk · on Nov 14, 2020

> This boggles the mind. I am using typed languages since I can think.

And therein lies the problem. It's a tradeoff: prototyping speed/readability vs longterm reliability. Modern languages have greatly improved the tradeoff, but it still exists, whether purists believe it or not.

I personally find the highest productivity in adding the majority of tests and typing later, after a design solidifies, not before.

finder83 · on Nov 13, 2020

The issue I have with API design goes much deeper...if you have a structured database, a statically typed language, and want validation, you're writing 3 separate schemas that all do slightly different but not really different things. Add protobufs onto that, you're now doing four.

Maintaining four different schemas does not make refactoring easy, in fact I'd even say it's harder than just using a nosql database and a dynamic language where all you're maintaining is the validation schema. (Yes, I realize there are other benefits to static typing.) Sure, it might generate the code initially depending on what you use, but you still have to maintain it. And any attempt to combine them has been way too complex/verbose. I want to write a lot less code, not more.

What I really want is a static language with a validation logic extensive enough that it can be used to generate CRUD APIs, validation, and still offer static typing. I don't know how this would work in practice, and I might hate it after I try it, but I'm tired of maintaining schemas.

Just to add to this, things like authorization, login, registration, etc, seem tiresome to write every time for a new API. I've been trying to find something more along the lines of a Backend as a Service, which Parse is very interesting for...but I wish there were more options and easier to use libraries for crafting APIs. It actually feels like we're stuck in the past no matter what we do. I welcome suggestions for easier to write APIs.

kentonv · on Nov 13, 2020

This is exactly why a lot of people choose to use the protobuf-generated data types as their data model internally within their app in addition to for serialization, and many people even derive their database schemas and validation from their protobuf schemas. (When I left Google in 2012, Spanner table schemas were literally defined using protobuf IDL. I haven't followed it since so I have no idea if that's still the case.)

Admittedly, protobuf wasn't explicitly designed for these purposes, it just of grew into them because they were convenient, and as a result there are certainly warts. But it's close enough that it works pretty well.

fizx · on Nov 14, 2020

You can take it all the way into the db with OSS if you like https://github.com/FoundationDB/fdb-record-layer/blob/master...

hedora · on Nov 14, 2020

The last time I checked, protobuf didn’t help with any sort of validation. In fact, the code I’ve had to maintain (written by others) is more verbose than a hand rolled, validating serializer would be in C++, even though it didn’t actually validate.

My only explanation for its success is that people often cargo-cult Google technology, and that it can be used for cross language communication by people that can’t hand-serialize a struct.

gravypod · on Nov 14, 2020

You can make use of features like annotations to magic away the validation of APIs [0].

> people that can’t hand-serialize a struct

If you have 1 device type and 1 software version then this might work. If you have multiple CPU architectures and multiple versions of your software this will likely not work.

[0] - https://github.com/envoyproxy/protoc-gen-validate

gravypod · on Nov 14, 2020

Very much this. We do that at my current job. We use protos to: 1. Define configuration options for all services/devices

2. Define data models

3. Persist them into our DB: https://github.com/CaperAi/pronto

4. Define our APIs

It's extremely convenient to have a single source of truth.

finder83 · on Nov 14, 2020

Fascinating, I didn't know it could be used for all of that. Last time I tried it it felt more like static typing than true validation (e.g. validating email addresses, date formats, dependent fields, etc) I'll take a look at it again, thanks.

p1necone · on Nov 14, 2020

I've had a lot of fun hacking together graph/tree types in protobuf.

bvrmn · on Nov 14, 2020

Sounds like a bad idea. If your storage follows HTTP API 1:1 then why you need backend at all? Take firebase all other db-as-service solutions. It's like developing without public interfaces. Implementation changes will break clients.

jayd16 · on Nov 14, 2020

> you're writing 3 separate schemas that all do slightly different but not really different things

This right here is the siren song of ORMs and many other code generation solutions. They are indeed different, with different concerns. Storage concerns and query concerns are different than your business/domain concepts and those are different than an external API. They're related and it would be nice if they were the same but fundamentally they are not.

danielbigham · on Nov 14, 2020

The project I'm working on does this -- we define high level semantic entity types, and from that you can:

- Define APIs that take those entities as inputs, automatically validating API arguments for you, automatically generating documentation, and driving API consistency.

- Can represent those entities in memory without writing any custom code beyond the schema.

- As mentioned above, you don't have to write manual code to validate your entities, that's done for you from the schema.

- Automatically generates database tables for those entity types without having to maintain a separate database schema, and can serialize/deserialize entities into or out of the database without having to write SQL. Acts like a document based database but still allows the power of relational SQL matching.

- From git diffs, automatically produce expressions that specify how the entity types have evolved over time, and then have the ability to apply those entity type diffs to a production server & database.

- Automatically maps natural language onto the schema allowing NLU queries like "orders today in Canada" to be turned into a semantic representation that the database can natively understand and execute, with control to override default NLU mappings if the default mapping isn't working right.

- Automatically generates web UIs for viewing and editing entities.

- Supports dynamic applicability to be defined so that whether a property is applicable, and which values are valid, can be defined as a dynamic function of other property values.

- Supports inferred properties to be defined in either an inference rule approach, or a functional approach.

kiwicopple · on Nov 14, 2020

In my previous startup I used my Postgres database to generate out JSON Schemas (detailed in [1]). It was great for validations. We even generated some of the UI from the Schemas.

> Just to add to this, things like authorization, login, registration, etc, seem tiresome to write every time for a new API

Check out Supabase [2], which is similar to Parse (disclosure: I'm a Supabase cofounder). We use Postgres as a backend and we hope to database as the source of truth (using Row Level Security + JSON Schema validation for the front end).

[1] https://paul.copplest.one/blog/nimbus-tech-2019-04.html#decl...

[2] https://supabase.io

finder83 · on Nov 14, 2020

This looks great, look forward to trying it out. Love the direction I'm seeing though

tylergetsay · on Nov 14, 2020

TypeScript + class-validators + reflect-metadata has let me build a GraphQL API w/ only one of these schemas. It's very intuitive/fast to develop for me.

ramraj07 · on Nov 14, 2020

What are your thoughts on fastapi? We end up having to define schemas only twice, one for sqlalchemy models and another for the version of that model you want to deliver via the api. The experience has been that the code is almost completely just schema definitions with very little boiler plate, and even the slight duplication between sqlalchemy and pydantic actually end up being more useful than not in every case.

finder83 · on Nov 14, 2020

I haven't given it a fair chance, I haven't done python in a few years at work, moving to typescript/node, Go, and Rust instead. I'll definitely check it out though. I was looking at it the other day and it reminded me a lot of flask or to a lesser degree Django with django rest framework, but I didn't make it far enough to understand what was going on with serialization.

I actually just learned about fastapi two days ago, I guess I'm a lot more disconnected from python than I thought I was.

Shorel · on Nov 14, 2020

A decade ago I wrote some Lisp code that generated C++ code with something similar.

It was completely inadequate, I would use SQLite now for the same. But it was very fun to write.

nostrademons · on Nov 13, 2020

The static/dynamic debate reoccurs every ~20 years and has more to do with the economics of the software industry than any intrinsic merits of each approach.

In times of rapid change, technological revolution, and landgrabs for market share, dynamic typing wins out. Why? Because you don't know what information you want to encode in your schema. You get this by putting products in front of users, changing them, and see what resonates and increases your growth rate. This was the case in the early 60s (mainframes & scientific computing), the early 80s (AI & PCs), and the early 00s (Web 2.0).

In times of steady-state, where the dominant players have already been established and we're just filling in niches, static interfaces dominate. This is a way of building supplier networks, increasing efficiency, and locking down best practices. At this point the users are largely known, we're just trying to serve them better. This was the case in the 70s (mainframes), 90s (Windows), and 10s (Internet/mobile).

The challenge for a new startup selling statically-typed APIs is that at this point, the ecosystem is already established, the big players are in place, and remaining startups are just filling in niches. Somebody is likely to win with a statically-typed API, but it'll likely be Google (gRPC), Facebook (Thrift), or Microsoft (TypeScript?) rather than some startup.

kentonv · on Nov 13, 2020

> Because you don't know what information you want to encode in your schema.

I'd argue that Protobuf actually embraces this -- its killer feature vs. other binary protocols of the era was that it's always easy to add new fields.

IshKebab · on Nov 14, 2020

Yes, but it's not dynamically typed at all. Dynamic typing is basically always the wrong decision in my experience. You give up a huge amount of basic error checking (e.g. 15% of JavaScript bugs are prevented by Typescript) just to avoid a little bit of typing. And it's not even like it saves you any time since with dynamic typing you spend so long trying to figure out what types functions expect and looking up APIs in documentation since code completion doesn't work.

Guthur · on Nov 14, 2020

Conflating binary and schema.

Also there are many over schema encodings that have better type systems than protobuf and allow dynamic schema, Avro being one for example.

ayewo · on Nov 14, 2020

Your first sentence is peak HN :)

On a serious note, kentonv properly prefaced his opinion with a qualifier “I'd argue that ...” but you somehow interpreted it as a fact ...

Anyway, if you think the person you replied to does not understand the difference between binary and schema, after helping author protocol buffers at Google, then no one can really claim to know anything here on HN.

Healthy skepticism is fine as it helps keep the discussion grounded, but its blind use will more often than not stifle insightful contributions from being posted by people close to the action.

Edited for clarity.

nostrademons · on Nov 14, 2020

Ignorance of who he's replying to aside, Guthur has an important technical point. Avro's separation of schema & binary is better than protobuf's field tags. You need a schema to meaningfully parse protobufs anyway (usually baked into the codegen), so you might as well make that schema a first-class object and package it with the data. Then you don't need to include or maintain the field tags (shrinking the binary size), you can explicitly reconcile different versions by sending along the new schema, and you don't need to codegen for each language, letting you easily support reflection and dynamic languages.

kentonv · on Nov 14, 2020

That approach -- sending the schema with the message and having the receiving end use it to reconcile version skew -- seems extremely expensive.

For it to be a win in terms of message size, the message has to be pretty large, or the schemas have to be negotiated at the start of a session that lasts a while. Maybe it would be a win for some use cases, but there's certainly plenty of use cases where the schema would end up much larger than the actual payload.

The bigger problem, though, is complexity. Code at the receiving end has to parse the schema, diff it against its own expectations, build the proper tables to translate the message content to the desired version, and then apply that.

I admit I haven't actually tried building something like this nor have I looked much at Avro's implementation, but this all sounds really slow and error-prone to me.

nostrademons · on Nov 16, 2020

> the message has to be pretty large, or the schemas have to be negotiated at the start of a session that lasts a while

Those are the two main use cases for a binary serialization format, though. Either you tend to have a very large row-oriented file with lots of homogenous records, say a SSTable or BigTable or Hadoop SequenceFile or Postgres binary column. Or you have a long-lasting RPC connection that serves one particular service interface, where you'd want to negotiate the protocol used by the service at the beginning and keep it up for hours.

I can think of a couple exceptions, like end-user HTTP connections to a public API. But for those, you usually need a complicated capability-based protocol negotiation anyway, because you need to handle hostile (hacked) clients or old versions that are stuck on something from 5 years ago. Google's habit of sticking a custom message type in a MessageSet and sending it along for a feature that triggers on 0.1% of traffic isn't really a thing outside of Google (and possibly other FANGs), not least because most companies can't afford to staff a team to maintain a feature used by 0.1% of traffic.

The solution for complexity is pretty routine: hide it behind a library. I'm not terribly fond of the particular API that Avro provides, but the wire format is sound & simple and there's nothing preventing someone from writing alternative library implementations.

Guthur · on Nov 14, 2020

This is my major criticism of protobuf and by extension grpc. IMO it emphasises two salient points, size on the wire and centralised control. The first is laudable but in my mind is taken too far because of the second. A decentralised distributed system requires ease of discovery and a decoupling of server and client.

I'm well aware of the many reasons to build distributed systems, not least of all the ability to distribute engineering effort, and so can see that if team distribution is a primary motivator for creating a micro service system that there would be a desire to make it appear like it's actually all one process. Of course it isn't one process but I can see the desire.

rubyn00bie · on Nov 13, 2020

They reference Protocol Buffers as a source/definition of a "schema" at the bottom and IMHO they're pretty rotten as schema definitions.

There are no meaningful validations or constraints, properties/members can be missing/omitted, cross-language support is pretty rough when you need more advanced features like annotations to fill-the-gap caused by crappy types.

Moving to protocol buffers is just pushing the problem out a while until we end up back here. Which is to say; a place where we just need to stop bike shedding how we define our fucking data. To quote the ineffable R. Sanchez:

"Sometimes I can't even tell you apart, because I don't go by height or age, I go by amount of pain in my ass."

I feel the same about all of these alternatives to HTTP APIs. I can only differentiate them by the amount of pain in my ass they cause; and on that front they're all the same.

com2kid · on Nov 13, 2020

> They reference Protocol Buffers as a source/definition of a "schema" at the bottom and IMHO they're pretty rotten as schema definitions.

A past team of mine went from "raw C structures over the wire consumed by 4 different implementations on 4 platforms" to protobufs and it was a huge improvement.

One schema definition shared by all the teams was an end to the nightmare of some developer putting a field in the wrong order and burning debugging time trying to figure what was going on. Or even better, that bug we hit in .NET that, even though we explicitly defined the struct, size, and ordering, the compile reordered one of our fields and we couldn't find a way around the bug so we had to make changes to the structure on all platforms so the bug didn't exhibit itself.

Or the great lengths we went through to avoid making any sort of breaking changes. We didn't originally pad our structures out more than an occasional handful of bytes (embedded, constrained storage and slow transfer speed over BTLE) so we quickly ran into all sorts of horrid issues.

Life was much better with protobufs.

Our main complaint was the lack of unsigned types. That sucked.

We didn't use any of the RPC stuff, we just used it for the schema and as a binary data format. Worked great, would recommend any day over raw C structs.

SV_BubbleTime · on Nov 14, 2020

Funny, I hated Protobufs in C. The generated code felt bloated and I was never at all happy with the memory management. We used NanoPB. This was on embedded so that might be a different perspective than yours.

com2kid · on Nov 15, 2020

We also used NanoPB, in both C and C++.

I am pretty sure we used preallocated buffers, so memory management wasn't an issue.

Being able to have a set of definitions that works on all major mobile platforms was all sorts of nice.

rubyn00bie · on Nov 14, 2020

Aye, I think (not that my opinion is worth a shit) y'all are using protobufs correctly assuming I'm grokking your use case correctly. I'm mostly focusing on discussions moving from JSON to protobufs... or essentially any problem space where serializing/de-serializing your data types isn't a ever going to be an issue (assuming one is not totally negligent).

Most of these articles focusing on protobufs as schema definitions are able to use gzipped JSON just fine. Their only reason to use protobufs is for schema definitions because they believe the "type system" will help them enforce constraints, validations, and or enable consistency across application boundaries.

The binary format, specification, and platform independence are completely irrelevant for these "schema" definition scenarios being brought up on HN constantly... and yet, they should be the things at the top of the list if you need protobufs and having a better more robust schema definition language should be damn near the bottom.

I think I just figured out a way to sum up my protobuf feelings (so, sorry for the late tldr):

Protobufs are a contract for serializing/deserializing data structures NOT enforcing validations/contraints of those data structures.

If you're using 'em as a serialization contract: fuck yeeeeeeaaaaah. If you're trying to use them to improve validations/constraints then: fuck naaaaaaah.

playing_colours · on Nov 14, 2020

They removed mandatory / optional in Protobuf version 3, and this rendered it useless for Confluent Kafka schema registry at our company.

I read the explanation for this change - to be more flexible about breaking changes, and while it may make sense for some cases, we could not rely on Protobuf 3 in event driven architecture with stricter requirements to data consistency. We went with Avro.

rubyn00bie · on Nov 14, 2020

I've been looking at Avro for a while; how are you all feeling about it? Any suggestions or gotcha from real world usage?

playing_colours · on Nov 16, 2020

So far so good. Avro is the longest supported serialisation format in Confluent Kafka, recently JSON and Protobuf were added. If you are on JVM stack, the drivers to work with schemas are well supported. We use Python, and Confluent driver is lagging behind if you want some advanced stuff like supporting Avro unions for multiple event types per topic approach [1] and missing auto-resolution for schemas for that scenario in Avro deserialiser. It is not difficult to implement by ourselves, but I would prefer not to do it.

[1] https://www.confluent.io/blog/multiple-event-types-in-the-sa...

quantified · on Nov 14, 2020

It's a vendor pitch, so take it all with a nice grain of salt.

Protocol buffers have large problems in their own way. Just because Google produced them does not mean they are the right choice for any broader adoption than they already have.

See https://news.ycombinator.com/item?id=18188519 for example, and note that the the discussion on personal vs technical aspects has already happened.

I agree with the idea of schemas for API definition. JSON and XML are more transport-level, lacking major semantics that must be enforced in the software, Therefore, schemas need to be expressed with language bindings.

Protocol buffers have good traction here because of the investment that Google has made in IDL with multi-language bindings. There are other serialization formats as well with many language bindings, but investment in the IDL needs to be made. The OP is drafting off of Google.

So, yes to the thesis and no to one of the conclusions.

Snitch-Thursday · on Nov 13, 2020

WSDL, JSON-WSP[1], now protobuf, and many more formats abound on the web.

In the spirit of 'choose the boring technology', which of these[2] is common enough / flexible enough to be worth learning to use and use well for non-fad web development / homestyle programming?

I'd default to WSDL since I happened to grow up learning that in school, but I'm sure that's my baby duck syndrome since CORBA etc came before that.

[1] https://en.wikipedia.org/wiki/JSON-WSP

[2] https://en.wikipedia.org/wiki/Interface_description_language

ako · on Nov 13, 2020

Maybe have a look at OData, strongly typed data model, standardized query language, selection of data required to optimize performance (similar to graphql), changelist support to enable caching, grouping to enable aggregating large amount of timeseries data, capabilities model so services can indicate what they do and do not support, and support for operations/actions, all conforming to REST.

techbio · on Nov 13, 2020

Thanks for "baby duck syndrome."

https://en.m.wikipedia.org/wiki/Imprinting_(psychology)

BaronSamedi · on Nov 14, 2020

My first thought when reading the article was, how about CORBA? The author makes no attempt to explain why CORBA and SOAP/WSDL failed so their recommendation has very little merit. Software technology historically follows a pattern: "worse is better". We might also call this pattern "commodification". HTTP/JSON has likely reached the lowest common denominator sweet spot to ensure its success and dominance.

qppo · on Nov 13, 2020

One day we're going to just return to c structs and live with it.

therealmarv · on Nov 13, 2020

wait... I also learned WSDL... is it still a thing? I thought this is SOAP related technology only...

foolfoolz · on Nov 14, 2020

when you use these open api generators or protobufs the transport is abstracted away. yes wsdl is soap and POST for everything, but when it’s abstracted away, does it really matter? the value is from not having to think about the transport layer at all

bww · on Nov 13, 2020

This article seems to view “APIs” as a monolithic concept wherein the trade-offs required are the same under all circumstances.

Particularly for public-facing APIs, unless you want to maintain a client library in every language your users could conceivably want — or make them shoulder the burden of dealing with the up-front costs of a far more complex integration that requires a bunch of tooling — plain old JSON over HTTP is going to be very difficult to beat for its familiarity and simplicity.

And that isn’t even to consider the various technical hurdles involved in using something like gRPC in a browser on the other side of a load balancer, for example.

mumblemumble · on Nov 13, 2020

Note that protocol buffers != gRPC. The reason you can't (directly) consume gRPC from a browser isn't because of protocol buffers, it's because there are no browser APIs to permit client code to work with HTTP2 streaming.

Clients consuming protocol buffers does require some tooling, but it's not that much more complicated than the tooling that is needed to consume JSON, which also requires a separate library for almost any language. Once you get past that, though, protocol buffers are relatively more language-friendly. The designers put a lot of thought into trying to ensure that it's difficult to design a protbuf-based format that can't be mapped onto the semantics of virtually any language. That is not the case with JSON, which, by being schema-on-read, makes it relatively easy to accidentally create an API that is fundamentally difficult to consume from many popular programming languages.

That isn't to say that it's perfect, but one thing I do like about the protobuf approach is that problems have a tendency to stay solved once you solve them. JSON-over-HTTP seems to always lead to this chronic low-grade messiness that seems minor at first, but the cost adds up over time.

rendall · on Nov 13, 2020

There is definitely need for better tools and discoverability for server-clent APIs. Protobufs seems like an odd take but I'm looking forward to their subsequent articles

The best in class right now is Open API 3, which offers a way to describe endpoints, verbs and expected responses with a JSON schema. And even so, working with it feels primitive compared to, for instance, front end focused tools with bundling, type checking, package management and so forth.

With Open API 3 / Swagger, you hand write a YAML or JSON file or fill out a form and have it done for you, but there is no code completion nevermind AST analysis. And now you have a schema, but there is no automatic link between that schema and the backend code, nor error typing, nor validation. The code generation options that are available are clunky and not customizable. If you change the code, you just have to go over the schema by hand again and make adjustments. Unit tests help of course, but you build it yourself

As for server responses, there aren't really coalescing best practices. There are many possibilities for incoming and outgoing headers, but no language or library to ensure getting them right, the way that TypeScript works for instance

I really feel like this is greenfield territory, but that's strange, since it's pretty critical

swsieber · on Nov 13, 2020

It definitely is greenfield territory

At work we have a java backend and a typescript frontend. I wrote some java code that traverses the api endpoints and generates typescript definitions for them. It's completely bespoke of course. I also had to write a small java compiler plugin to record the type parameters for method return values on our api endpoints.

nullsense · on Nov 13, 2020

In .NET Core there is a nuget package you can add that generates the open api docs based on your controllers. It's a good start, though to make it fully accurate you have to add the right attributes to each endpoint. It works pretty well.

What I find really irksome about the .NET openapi tooling is the client generators use RestSharp under the hood which doesn't support HttpClient. There's no way you shouldn't be using HttpClient in 2020. So companies leveraging openapi to generate their SDKs are generating sub-par code.

BigNick · on Nov 15, 2020

What's the problem with RestSharp?

Used it a while ago and it was pretty easy to use.

nullsense · on Nov 16, 2020

It doesn't use HttpClient under the hood. So, it can't make use of DelegatingHandlers. Polly is build on DelegatingHandlers. They're incredibly useful.

RestSharp is old. It's from a bygone era when making Http requests in .NET looked very different.

You'll find Refit or Flurl are also easy to use but are a significant upgrade due to being able to use HttpClient.

kronin · on Nov 13, 2020

You don't have to hand-write your openapi.yaml. we are using quarkus, and by annotating the endpoints you get a generated openapi.yaml.

We then use this in a typescript generator to generate client code.

treve · on Nov 14, 2020

I understand more than a few do this and love this. To me the schema & design should test the implementation. If you generate the schema based on the implementation, your schema is basically implicit.

I dont think by doing it this way you really get the benefit of the premise of this article. You get some other benefits, like API documenation and perhaps client code generation and that's perfectly ok if that's all you need.

However, if you believe in the idea of schema-driven, design-first api development, don't do this.

crucialfelix · on Nov 14, 2020

I think the parent comment doesn't mean that the schema is generated from the implementation, but that there are annotations colocated with the code that provide metadata to generate the schema.

This is what I do. I have typescript functions that generate openapi responses.

I do generate the schémas from sequelize models, blending in descriptions and enums where appropriate.

https://southpole.stoplight.io/docs/calculate/docs/CSV%20imp...

piaste · on Nov 14, 2020

We first write the OpenAPI spec in collaboration with non-developer stakeholders (product managers or whatever).

Then we write the code to implement the spec. (We tried using code generators for this step, but quickly dropped them. If your language and web frameworks are sufficiently concise, adapting the generated code to your needs is likely to be more work than just writing from scratch.)

Then we enable an automatic tool to automatically generate an OpenAPI spec from the code. We can then compare it to the original spec to quickly see if we missed something. It's not an automated diff, but it still saves a _ton_ of time.

Finally, we use code generators for the client code. Those actually work pretty well, unlike server-side code generators, probably because there's little room for customisation in making HTTP requests.

ak217 · on Nov 13, 2020

Agreed, I'm shocked there aren't any frameworks with a focus on integration between the database ORM/schema, authn/authz, I/O validation, the data model in the client bindings, etc. None of this stuff is new or controversial ever since OpenAPI 3 came out, but I'm not aware of anyone who has even tried.

rendall · on Nov 14, 2020

I'll ding one more thing about Open API 3.

While it is an open spec, the spec maintainer, Swagger, definitely keeps its fingerprint on it for business reasons. Among other issues, their validation tools in particular require a sign-up. Postman also has an Open API 3 schema builder, but also requires a sign up. There is no technical reason for this, and you have to trust Swagger or Postman to be responsible with your API schema. I believe this front-and-center branding of the spec itself makes it not as appealing or dynamic as it otherwise would be

edit ApiBldr seems ok: https://apibldr.com/

edit 2 AsyncAPI seems interesting: https://www.asyncapi.com/

wing328hk · on Nov 15, 2020

You may want to give https://apitools.dev/swagger-parser/online/ a try to validate OpenAPI (fka Swagger) files.

You can do it online without a sign up or use the API or CLI according to your preference.

lxe · on Nov 13, 2020

"Breaking changes" happen regardless whether you use protobufs or JSON. Where the schema contract is enforced, run-time or build-time is irrelevant.

- If I serve a JSON API and make a change, your client may stop working correctly if I broke the API contract.

- If I serve a protobuf API and make a change, your client may also stop working correctly for the same reasons.

kentonv · on Nov 13, 2020

With schema-driven protocols, it's possible for linters to automatically detect many kinds of breaking changes and flag them, to stop developers from creating a breaking change by accident.

Of course, it's still possible to make breaking changes without changing the schema. But being able to detect and prevent some kinds of bugs is better than not.

mixmastamyk · on Nov 13, 2020

Yes, the contract pushes the break forward, to happen at build time rather than waiting for run time.

EdwardDiego · on Nov 13, 2020

Yeah, more tooling is required. We use Schema Registry for our Kafka consumers and producers and it's quite handy, we tend to use it in FORWARD compatibility mode. We've also in the past used build time compatibility checks for the same purpose.

https://docs.confluent.io/current/build-applications.html

recursivedoubts · on Nov 13, 2020

I don't have strong feelings about this article except for the fact that it once again uses REST to mean "JSON over HTTP" when, in fact, REST requires HATEOAS and the vast majority of APIs that claim to be RESTful aren't, and almost certainly shouldn't be, since they aren't using a hypermedia/hypertext.

https://intercoolerjs.org/2016/01/18/rescuing-rest.html

https://intercoolerjs.org/2016/02/17/api-churn-vs-security.h...

The API community made a mistake in appropriating the terms and concepts from REST (a description of the original web architecture) and we've been unwinding that mistake for almost two decades now, but the momentum of language is amazing...

nicholasjarnold · on Nov 13, 2020

Yes, true pure technical REST probably does require HATEOAS. However, there's still a lot of value in a "RESTful" API that that achieves at least 'Level 2' in the so-called Richardson Maturity Model.

https://www.martinfowler.com/articles/richardsonMaturityMode...

On my teams I advocate for building Level 2 maturity "RESTfulness" into the design as it seems to strike a good balance between following the principal of least surprise for those who need to consume it and the heavier implementation effort required to get to that Level 3, which as you point out is probably not even appropriate in most cases.

solipsism · on Nov 14, 2020

Seriously, give it up. It's a losing battle. The term means something different from what you want to to mean. How many HN comments about it are we going to have to suffer through?

recursivedoubts · on Nov 15, 2020

At least one more!

gipp · on Nov 13, 2020

My sense is that the article would agree strongly with you here but didn't want to get into that long digression.

mixmastamyk · on Nov 14, 2020

What types of applications is HATEOAS appropriate for? A CMS perhaps?

hcarvalhoalves · on Nov 13, 2020

> In other words, the benefit of maintaining type signatures now well outweighs the cost.

That's a rather bold claim to begin your argument, but there are no references.

There's a good review of related studies here: https://danluu.com/empirical-pl/

mistersys · on Nov 13, 2020

It's fairly obvious that for the any public facing API, type signatures are going to be a net positive, because you have to write input validation for your API anyway to have any hope of it being secure.

You can either write as a series of buggy, hard to understand if statements or use a well-defined schema tool of some sort. Furthermore, once you have the schema you can generate client APIs in e.g Typescript for your APIs instead of wasting engineering effort on writing them manually.

In terms of static types generally, outside of APIs, the equation can be different. However, if you are creating a library that many developers will use, given the state of IDEs, your just hurting your users (developers) by not providing static types. I use Typescript in every single project not because of supposed benefit of "type grantees", but because I don't need to spend all day flipping between some developers poorly written, incomplete docs to understand how to use an API. My code editor will tell me the whole format and I can explore the APIs with goto definition, demonstrably reducing development time.

spinningslate · on Nov 13, 2020

Agreed. The unconditional nature of the statement detracts from its message. Had it said: "we see benefits in static types for APIs, and it's working for us - here's why" it would have had some credibility. And provided some insight on why it might work for others too.

Contrast with DHH's articulation on the corecursive podcast [0]. Paraphrasing:

* Dynamic types work for him * People don't all think the same. So static typing is fine too.

Jeremy Howard made essentially the same point on TWIML AI [1].

The point is there's far more to success than static vs dynamic typing. Personal preferences, team skills, culture, task at hand, nature of domain, commercial imperatives, code base size, and many other things besides all influence choice.

To suggest there's a single, global answer to static vs dynamic - whether for languages or APIs - is a misguided and unhelpful simplification.

[0]: https://corecursive.com/045-david-heinemeier-hansson-softwar...

[1]: https://twimlai.com/whats-next-for-fast-ai-w-jeremy-howard/

karmakaze · on Nov 13, 2020

I think a better way of phrasing this would be "the cost of maintaining type signatures has been reduced well below the benefits" since I believe buf makes tools to do just that.

kentonv · on Nov 13, 2020

I think it's both that the costs are going down, and that the benefits are going up, in the form of useful tooling that couldn't exist without schemas to drive it.

Quantifying these costs and benefits is, of course, notoriously difficult.

hinkley · on Nov 13, 2020

I think the biggest way API design is stuck in the past is predicated on the thinking that because the code is available, the degree of documentation we need is diminished.

This isn't a rant about documentation. It's about the cognitive dissonance caused by making the internals of our code for ourselves but then expecting all of the users to read - or more likely, step through - that code to figure out why it's behaving oddly with empty strings.

What I'd like to see us do is to make the shape and contents of the stack frames one of the criteria by which we judge the quality of our code. This also affects the quality of the error stack traces, but it's more of a 'pro' than a reason.

Most of the time I'm only reading your (general you, not just API designers) because I am having an issue. It's a basic psychological reaction that the people near a pleasant or unpleasant experience are partly blamed for that feeling. If I'm trying to be sympathetic to that, or my user, then I have to think about my code as something people mostly look at (and almost entirely remember) when they're already having a bad day.

Any work I do to honor that relationship has often paid dividends the next time I'm having a hard day, and especially when we're having a hard day together (eg, a production outage or a missed milestone)

rohan_ · on Nov 13, 2020

I think this only solves the problem for the API owner, not the consumer, right? You can't use cURL with protobufs like you can with REST to quickly play/interact with the data

barumi · on Nov 13, 2020

>>I think this only solves the problem for the API owner, not the consumer, right?

The only problem this solves is their need to sell their product.

This so-called "schema-driven development" approach was already tried in the past and failed miserably. Call it SOAP or OData, the world already gave that a try, saw all the mess and operational and productivity problems it creates, and in spite of the gold mine it represented for tooling vendors and consulting firms, it was a collosal mess.

It's very weird how their selling pitch is based on veiled insults on hoe "the industry is still twenty years behind", but they failed to do their homework to learn the absolute mess that their so-called "schema-driven development" approach left behind.

It's as if they are totally oblivious to why the whole world shifted to "free-form" APIs, which worked far better than the SOAP mess, and their are hell bent on betting on a rehash of the bad old days.

kentonv · on Nov 13, 2020

Basically everything inside Google has been built this way through their whole history. Are you calling them a failure?

barumi · on Nov 13, 2020

Just because a company managed to do stuff a certain way that does not mean that way is the right way or adequate or good.

Enough with this cargo cult bullshit approach to technical problems.

kentonv · on Nov 13, 2020

Your argument seems to be entirely ad hominem.

SOAP failed because it was a bad design (overcomplicated, verbose, XML-based), not because it used schemas. I've never heard anyone who uses Protobuf say it reminds them of SOAP.

musingsole · on Nov 13, 2020

Your previous post is an appeal to authority. You're both mired in fallacies.

cfors · on Nov 13, 2020

There is tooling built to easily do grpc requests from the command line[0] or even a GUI if that's your thing[1].

The tooling isn't quite as ubiquitous as cURL, but if anything I would think that its even easier because you have introspection into the contract with a .proto file.

[0] https://github.com/fullstorydev/grpcurl [1] https://github.com/uw-labs/bloomrpc

kentonv · on Nov 13, 2020

Exactly. JSON is benefitting from the fact that it can ride on top of existing non-dedicated tooling in an ad hoc way. Protobuf/gRPC needs dedicated tooling designed to replace what JSON can do with cURL. But that tooling has the potential to be much better than what can be accomplished with JSON, because it can actually type-check your input (e.g. catching typo'd field names) and maybe even let you explore the schema interactively. E.g. imagine having bash tab completion for these APIs -- easy if you have the schema.

ijidak · on Nov 13, 2020

I take issue with this statement in the article:

> the industry has learned over time that statically typed languages actually enable a whole host of new tooling possibilities, and ultimately, this tooling can drastically improve developer productivity and codebase maintainability

I would say, a new generation of developers have learned this.

But I don't think the industry as a whole ever thought dynamic everything was something that would scale long-term.

Dynamic languages are great for prototypes and tiny teams of 1 to 5 disciplined developers.

As soon as the LOC starts to climb, or you have one loosely disciplined developer (which describes 80% of developers on the planet), you're toast.

I never felt dynamic would become the central paradigm for most business software development.

Just something used by startup founders less than 5 years out of college, hobbyists, and for selective portions of large projects (like maybe plugins and extensions of a larger piece of software).

hashamali · on Nov 13, 2020

Surprised there is no mention of OpenAPI. A spec driven API development process solves most of the problems described without having to switch to schematized data serialization formats.

kentonv · on Nov 13, 2020

I admit I haven't written an OpenAPI schema myself. But, looking at the docs, I have a hard time imagining that I'd want to do API-driven development with it. Let me explain what I mean by that...

When I start a new project, or a major new feature in an existing project, usually the first thing I do is write out the Protobuf or Cap'n Proto schemas describing the APIs and data structures. Even for internal-facing APIs. APIs are how components talk to each other, and defining the APIs defines which components are responsible for what in a very clear way. Often I find writing out the APIs is actually the easiest, most precise way to communicate a design with my teammates; much better than prose in a design doc.

For this purpose, it's actually pretty important that the schema language be something that's not cumbersome to write. Protobuf and Cap'n Proto feature dedicated schema languages that feel like a programming language. It looks like OpenAPI has you writing schemas themselves in JSON or YAML. Ick.

And once I've written those schemas, I run the code generator and poof, I now have nice, type-safe classes in my programming language of choice implementing my schema. This lets me move a lot faster implementing, being able to use my IDE's auto-complete and whatnot. Does OpenAPI generate code for you? I have worked on projects that use it and didn't have any generated code, so I am assuming not...

I think this is what the article is trying to get at.

(Disclosure: I'm the author of Cap'n Proto; long ago I was the author of Protobuf v2; and I am a (small) investor in Buf because I like what they're doing, but I don't work with them.)

hashamali · on Nov 13, 2020

OpenAPI is just the format for describing APIs for various use cases such as documentation, runtime validation, and yes, even code generation. Depending in the language you use, the support for code generation may vary though.

I’m certainly not against the flow you’re describing. Aside from having to wrangle JSON/YAML instead of a more concise DSL, it’s not too different from what you’d do with OpenAPI.

ccleve · on Nov 13, 2020

Nobody writes OpenAPI directly. You use an API design tool for that.

There are two ways to design APIs: code-first and design-first.

With code-first you write your code and then automatically generate the OpenAPI spec from that. I do it in Java using JAX RS and it works fine. It is strongly-typed and has all of the same advantages you get by designing your protobuffers first.

The world is moving toward design-first, though. You use a tool to write your API and then generate code from that. It's a nice idea, but I'm not there yet because I don't like the tooling.

I expect that in a few years the tooling will be good and we'll be able to generate both APIs and database schemas in one place. Looking forward to it.

wing328hk · on Nov 15, 2020

> Does OpenAPI generate code for you? I have worked on projects that use it and didn't have any generated code, so I am assuming not...

Please give OpenAPI Generator [1] a try to generate clients, servers stubs, documentation, schemas (graphql, protobuf, etc) and more. It supports many programming languages and many companies are already using it [2]:

[1] https://github.com/OpenAPITools/openapi-generator

[2] https://openapi-generator.tech/users

(Disclosure: I'm the top contributor to OpenAPI Generator)

mayank · on Nov 13, 2020

> Does OpenAPI generate code for you?

An OpenAPI spec is the equivalent of your .proto files. You need codegen to go from your .proto files to native classes, and you need codegen to go from your OpenAPI spec to native.

> Protobuf and Cap'n Proto feature dedicated schema languages that feel like a programming language. It looks like OpenAPI has you writing schemas themselves in JSON or YAML. Ick.

Counterpoint: I can take an OpenAPI spec and parse/process/render/codegen it in every single language without a custom parser library.

Shorel · on Nov 14, 2020

Your counterpoint makes no sense...

And Protobuf has generators for more languages than OpenAPI/Swagger.

wing328hk · on Nov 15, 2020

FYI. OpenAPI Generator supports many programming languages and server-side frameworks: https://github.com/OpenAPITools/openapi-generator#overview

This is not to say it has more languages supported than Protobuf as I do not know exactly how many generators out there supporting Protobuf.

(OpenAPI Generator also comes with the generator/converter that converts OpenAPI spec documents into gRPC and protocol buffer schema files. Not sure if OpenAPI Generator is also counted as a generator for Protobuf)

erpellan · on Nov 13, 2020

One of my issues with schema languages is that they're not actually languages. Having used kotlin's typesafe builders a few times, other mechanisms feel hamstrung by comparison.

mtoddsmith · on Nov 13, 2020

With OpenAPI you can use an API editor

https://swagger.io/tools/swagger-editor/

Sure you're writing YAML/JSON but the editor is type checking in real-time and enforcing the schema.

hn_throwaway_99 · on Nov 13, 2020

This is exactly why I use GraphQL for my APIs:

1. All inputs and outputs are strongly typed, and the framework validates the types before my code touches it

2. Can easily generate Typescript types from my GraphQL typedefs.

3. Types can evolve easily to maintain backwards compatibility with existing clients withOUT requiring explicit versions (if so desired).

_dujt · on Nov 13, 2020

Not only is there benefit for the developer, but for the consumer as well. Having typed APIs allows IDE-like autocomplete when working with API tools like Insomnia.

An entire website dedicated to documentation for REST endpoints and their query parameters can be replaced with a simple readme instructing those to point their tool to the GQL client endpoint.

taeric · on Nov 14, 2020

Feels so strange to see JSON and REST starting to get more criticism for not being XML and WSDL. Sure, they don't say that directly, but it is hard for me to see it as anything else.

Now, I fully grant that WSDL really faltered over what feels like a proxy battle between Sun and Microsoft now. As I recall, you could make a WSDL that was easy to use from .NET, or from Java; but not both. I don't recall what the other languages were like.

Which is all to say I don't really disagree. But it is eye opening to think that this is all getting trodden again.

deathanatos · on Nov 14, 2020

Nothing about REST prevents strong validation of inputs. If anything, the notion that the ends should pass data using well-defined media types should push one towards something that can be validated. I don't think "everything chucks application/json around and we have no idea what the JSON represents is in the spirit of REST at all.

The set of externally standardized media types is somewhat small. When I worked in real estate, there's not really a standardized set of stuff for that problem domain, and we're pretty much defining our own. And, worse, IMO, we did so badly, because we didn't take the time to lay out a good, well documented specification. Some of the schema languages out there can help with that. (Though I don't think they're a panacea.)

Likewise, while I agree with some of the article, protobufs are just not terribly great. The lack of sum types really hampers how you can represent stuff in them. (We had other, more practical issues when we tried to adopt them over JSON in my last org.) But, also, if you want to define application/vnd.foo.bar.baz as "a protocol buf with this definition that contains data representing this" … that can still be as RESTful as JSON.

(And, for the XML comparison… JSON is too verbose. XML is just even worse. JSON was a step in the right direction.)

taeric · on Nov 14, 2020

I don't really disagree. But I have yet to see a schema system for JSON that isn't just a crappy version of what XML can do. Similarly, most ways of representing services turn into bike shedding on how to map to HTTP.

I will also add that as soon as you think you need complicated models at the API level, you are almost certainly setting a path to problems. Yes, it is possible to come up with a place where sum types will make sense. In most cases, though, you are better representing a flat set of data, and let a validation layer move things into distinct types, if you really need it.

I view this as everyone wanting a programming language where you control AST constructs. Sounds great, but working with raw text/data at a layer above will almost certainly be easier. And you can use a compiler/evaluator to get to things you can manipulate at that layer. (Similarly, this idea that files are the bane of computer. In stark contrast to my always just wanting a file that represents whatever I am wanting to move from one place to another. However I want to make that happen.)

plutonorm · on Nov 13, 2020

Show me studies where static typing is shown to be superior to any of the other options. I'm not saying it's bad, but I'm irritated by the tone which assumes it knows things which it does not.

kentonv · on Nov 13, 2020

TypeScript is a thing which exists for the sole purpose of adding type checking to JavaScript, and nothing more. In particular, unlike most historic statically-typed languages, TypeScript gains no performance advantage from types; it literally compiles down to JavaScript before running. The only possible advantage is developer productivity stemming from type checking and other tooling it enables.

TypeScript is extremely popular.

Obviously there is some subjectivity here. Different people can have different opinions. Different kinds of projects may call for different approaches. But it's hard to argue that all those developers using TypeScript chose to do so without getting any benefit.

superfamicom · on Nov 13, 2020

I've personally found JSDoc gets me what I want out of TypeScript without any compiling or features that don't actually exists in the language (I'm glaring at you enums). Ensuring the documentation is correct ensures the types are correct. You loose some nice-to-haves but to me not having to compile my code or deal with that variety of tooling is a huge plus. I can facilitate folks who like TypeScript by exporting types from the same JSDocs.

SkyPuncher · on Nov 13, 2020

Have you ever done a wide refactor of a type or public method?

It's bliss in TypeScript. Nearly impossible in JS.

----

To me the value prop of Typescript isn't the initial writing, it's the maintainability. Having good types makes it significantly easier to maintain and refactor.

superfamicom · on Nov 13, 2020

I have, I didn’t have a bad time, but YMMV.

awinter-py · on Nov 13, 2020

oof -- I assumed 'schema-driven' meant database schema driven until I read further down

tools to automatically generate crud RPC from DB models + roles, like hasura and postgraphile, are smart and as they standardize will be money savers

protobuf is okay I guess; last time I touched this 12 months ago the mobile platforms didn't have sufficiently good build + typing tools to take advantage

eikenberry · on Nov 13, 2020

IMO talking about API design as the focus is stuck in the past. If the last 20 years has taught us nothing else it has taught us that protocols won the whole http://wiki.c2.com/?ApiVsProtocol debate. No one can argue the complete dominance of HTTP as the dominant RPC mechanism in use today. We should be talking about how we can structure new, higher level protocols on it or create new protocols to improve on it. Not how to replicate all the previous mistakes on HTTP instead of TCP.

J-Kuhn · on Nov 13, 2020

No comment that says something like "Use SOAP instead"?

jugg1es · on Nov 13, 2020

The whole time I read this article, I was thinking "well, there's SOAP...". Was very surprised when I realized he was talking about something like protocol buffer. SOAP is a great way to solve the problem described in this article, but good luck creating a SOAP API in node. "What is old is new again"?

titanomachy · on Nov 13, 2020

SOAP has a high serialization overhead and doesn't support schema evolution.

jugg1es · on Nov 13, 2020

Oh, it definitely has a lot of overhead but it solves the primary problem as described in this article.

titanomachy · on Nov 14, 2020

Sure, it's a serialization standard that uses schemas. In that sense we are definitely coming around in a circle.

tflinton · on Nov 14, 2020

F$$$$ it, everyone just directly connect to everyones database and use SQL.

Thetawaves · on Nov 14, 2020

My experience with SOAP and wizzles indicates that one of the primary problems with statically typed protocols is the proliferation of types. Corresponding types need to be created in each language, and these object representations reflect the original implementation language (server), rather than whatever idiosyncrasies fit the current language (client).

The core benefit the world gained by migrating to REST/JSON was the total simplification of types, and the ease of interacting with them in any language.

TeeWEE · on Nov 13, 2020

We are using GRPC and for Rest OpenAPI and write our own codegenerators for OpenAPI...

How is this in the past?

I guess most developers don't use OpenAPI and just return JSON... Thats indeed an old way of development.

losvedir · on Nov 13, 2020

As a fan of static typing, the ideas in the blog post do resonate with me.

However, one concern I have with this approach over simple JSON, is client size. A JSON API can be consumed by the web client natively, basically. Whereas using, e.g., protobuf will require some tooling to digest a .proto file to generate some client side code for you to use. In practice, if you're embedding that in your web app, how large will that be?

rpdillon · on Nov 14, 2020

I've hit this in scripting. I wanted to submit performance metrics from a build script to an analytics service another team was running, but the service only accepted gRPC. This meant the bash script had to somehow pull in the entire protobuf/gRPC stack so it could say "The build on this host for this branch took 12 minutes".

I think accessibility from a broad range of environments is a disadvantage of the "protobufs everywhere" approach. I'm one of the few that still believes in dynamic typing though, so I'm probably biased.

seanhandley · on Nov 14, 2020

I agree there are many positive aspects to schema-based API systems.

What I disagree with is the assertion that "X is the future, Y is the past".

Novelty does not automatically equal superiority and there are tangible pros and cons for each.

I might be able to take the discussion slightly more seriously if the article wasn't part of a marketing strategy for a company selling protobuf-based products.

vmception · on Nov 13, 2020

Step right up, place your bets

"Will schema defined apis become a thing in 2021 or continue to stay exclusively in the realm of 'thought leadership' aka blogs"

Razengan · on Nov 14, 2020

I think most Swift APIs, especially SwiftUI and Combine, are pretty impressive, clean and modern.