Hacker News
Typing Is Hard (typing-is-hard.ch)
260 points by lelf on Aug 8, 2020 | 79 comments



Pretty much any language supporting something like C++'s "template metaprogramming" will allow the programmer to perform arbitrary computation at compile time. This is, of course, "undecidable", because it's Turing complete.

This is not necessarily the end of the world. If you go out of your way to write an actual compile-time program by abusing your type system, and that program contains an endless loop, you probably shouldn't be surprised that your compiler hangs.

Unsound type systems cause more headaches in practice. For example, TypeScript types are only trustworthy if your program is careful about boundaries with the outside world. In larger, older systems, I occasionally do get bit by this.
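The same failure mode can be sketched in Python's gradual typing (function names here are made up for illustration): a checker such as mypy accepts the call because json.loads returns Any, yet the program fails at runtime.

```python
import json

def greet(name: str) -> str:
    # A static checker trusts this annotation inside the function body.
    return "Hello, " + name.upper()

# json.loads returns Any, so the call below type checks even though the
# payload actually carries an int where we promised a str.
payload = json.loads('{"name": 42}')

try:
    greet(payload["name"])  # accepted statically, blows up at runtime
except AttributeError as err:
    print("boundary violation:", err)
```

The fix is the same as in TypeScript: validate data at the boundary before letting it flow into typed code.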


> Pretty much any language supporting something like C++'s "template metaprogramming" will allow the programmer to perform arbitrary computation at compile time.

It depends on what you mean by "something like": Idris' dependent types are arguably just as useful in practice, but type checking remains decidable thanks to totality checking. The same goes for Agda, Coq and Epigram (if it still exists?), each for different technical reasons, not that they are as convenient to program with.


Just because the current implementation of the Idris type checker always terminates due to the size change termination heuristic does not mean that type checking Idris is decidable. If the type system as specified allows an author to express undecidable statements, it should not be considered decidable, regardless of whether one type checking implementation terminates on all inputs. If some type checker does not implement its type system, this is a property of the implementation, not the language. Otherwise any undecidable type system could easily call itself decidable by checking only a subset of the specification (in fact, this is exactly how most checkers for undecidable type systems work anyway). If the type system for a language is undecidable, type checking that language is undecidable.


Incorrect. Whether the language, as a logic, is capable of expressing undecidable statements is orthogonal to whether type checking is decidable. Type checking is analogous to checking proof validity in logic. Commonly used proof systems for first-order or higher-order logics admit decidable proof validity, and already many first-order theories allow expressing undecidable statements; ZFC is a rather obvious example. Likewise, intensional type theory is highly expressive as a logic and admits decidable proof validity.

Decidability of type checking is instead related to how explicit surface syntax is. In intensional type theory, the usage of equality proofs is marked explicitly, which is sufficient to retain decidable type checking. In contrast, in extensional type theory, the usage of equality proofs is left implicit, which causes type checking to be undecidable.


This is not true, as MLTT-style type theories have a different syntactic notion of proof than first-order logical theories like ZFC. Proving termination of type checking for Idris is equivalent to proving that all terms can be reduced to a normal form: for any given term, you can construct another term whose type checking requires fully normalizing the first term. It is well known that proving that normal-form evaluation of all terms in an MLTT-style type theory terminates is equivalent to the system's consistency (because the false type has no constructors). This is how relative consistency of these type theories is usually proved.


I don't see that "normalization implies consistency", which I'm aware of, relates to my previous comment in any relevant way.

ZFC and MLTT do not differ here: in both, decidability of proof validity is unrelated to logical expressiveness.

It's not even true that for type theories, decidability of proof validity implies normalization. Normalization is not logically equivalent to decidable type checking.

For example, we can have a term language for extensional type theory which is annotated with reduction traces. This is the kind of syntax that we get when we embed ETT inside ITT. It has decidable type checking, as the type checker does not have to perform reduction, it only has to follow traces. This kind of tracing is actually used in the GHC Haskell compiler to some extent. So we have decidable type checking for a surface syntax of a type theory, for a type theory which is a) consistent b) not strongly normalizing.


You’ve actually hit on the difference here: in the type theory, the witness term plus the reduction trace is equivalent to the ZFC proof. It’s true that checking this combination is decidable, regardless of consistency. The intensionality does not eliminate the limitations of this property; it only provides that the reduction trace is implicit given the witness term. The actual proofs of decidability of type checking (e.g. from Martin-Löf‘s original paper) are conditioned on the normalization theorem, which assumes consistency of the type theory.

Let’s come at this from the other direction. Suppose I find a non-normalizing term X of type false (like that produced by Girard’s paradox if we leave off stratification). What happens when I try to type check this?

id(X): ==(X, rec_0(0, X))

Where id is the constructor of the propositional equality type and rec_0 is the recursor for False.


What you say is kind of interesting but I get the impression that we are talking past each other.

id(X): ==(X, rec_0(0, X))

Sorry, what's this supposed to mean? Is it a definition? The propositional equality type has two or three arguments; I only see one (X) here, if, as you say, "id" is propositional equality.

In any case, nothing peculiar happens when we check a trace-annotated proof of bottom. We are free to use any particular finite-step unfolding of the non-normalizing proof.

We also don't need any paradox or inconsistency for a non-normalizing term, it is enough to work in ETT in a typing context which implies bottom, or which contains an undecidable equational theory, e.g. the rules of the SK-calculus.

> The actual proofs of decidability of type checking (e.g. from Martin-Löf‘s original paper) are conditioned on the normalization theorem

Decidability of type checking is always relative to a particular surface/raw syntax. A trace-annotated raw syntax does not need a normalization assumption for type checking.


I left the first argument of the equality type implicit; written fully it’s ==(0, X, rec_0(0, X)), since those terms are both of type 0.

The overall expression is shorthand for any term whose type checking involves that type assertion. For example, passing the term to the left of the colon to an identity lambda whose argument type is the term to the right of the colon, (λ(x: ==(0, X, rec_0(0, X))).x)(id(X)), will type check iff the left term’s type is the right term.

I’m not talking about extensional type theories here (as Idris/MLTT is not extensional), so I don’t think I understand the relevance of the trace-annotated proof. Idris will let you write the term I have above without any trace annotations, so its type checker must do something with it.

You’re right that a trace-annotated raw syntax doesn’t need normalization assumptions, but Idris is not trace annotated; the type checking algorithm is required to perform any normalization required without further direction from the user.


I think you might be confusing the language and the type system. I'm not saying anything about the runtime characteristics of the language (which is obviously Turing complete), but about the type system itself, which is TC. If the type system allows you to encode a TM, then it is undecidable, regardless of whether checking terminates due to an incomplete type checker. Since the Idris type system allows you to encode a TM, the type system is undecidable due to the halting problem.


A type system cannot be TC. What you seem to be talking about is that if a type checker can simulate arbitrary TMs through an encoding of its input, then the type checker is necessarily non-total. This is correct. But the Idris checker is total, and it is not possible to use it to simulate TMs.

The Idris type system does allow you to specify and formalize TMs internally, but that has no bearing on decidability of type checking, as I explained before.


A type system can be TC. A type system is a logical system comprising a set of rules that assigns a property called a type to the various constructs of a computer program, such as variables, expressions, functions or modules. If those rules can encode a Turing-complete system, then the type system is Turing complete. Type checking is the process of verifying and enforcing the constraints of types; type assignment is different from verification.


This is right. I was comparing Idris (the only existing implementation, rather than its core type theory) to C++ (which has many implementations). I think the situation may have been similar for Epigram, as its core theory was very expressive but eliminators were only elaborated for strictly positive types, or something like that (I might be wrong about this).

My point was just that the situation with C++ is worse than it has to be.


Sure, you don't have to go out of your way to challenge your type system, but it still means that you're a single typo away from sending your compiler and yourself on a wild goose chase.


> Sure, you don't have to go out of your way to challenge your type system, but it still means that you're a single typo away from sending your compiler and yourself on a wild goose chase.

the alternative would be writing bash or python or ... scripts to generate the code, and for those:

- you likely don't have any error message at all (ugh)

- they won't have knowledge of the type system of the language you're operating in.

Seriously, everyone complains about pages and pages of template errors, but if you use any decent IDE it's a two-click process to get to the part of your code where things fail. Errors will be split; then (at least with GCC & Clang) you just start at the bottom of the error, which is the place in your code where things go wrong 99% of the time, and if not, it's rarely more than a couple of steps above that. See e.g.: https://streamable.com/0k6lvl


There are mainstream languages with decidable static type systems. They compile quickly and give obvious short error messages.


names! we want names!

also, decidable likely implies total, and my experience with total languages is that more often than not you need to add another Turing-complete "scripting" layer on top (for instance, a TC language to generate expressions in the total language), because they're so cumbersome to write in otherwise once you have actual business rules and not simple math examples


Stage early and stage often; if a single-but-syntactically-valid typo will break your build then you should only be a commit away from something that works.


Correction: Java’s type system has been unsound since version 5; version 8 (which TFA describes as the start of the issue) just happened to be the latest version at the time it was discovered.


Thanks, fixed!


If you have some form of Turing-complete macros, then macro processing might not terminate. I don't see why this is anything but trivial, or a compelling argument against Turing-complete macros.


This page isn't arguing against anything though.


I don't think it's a compelling argument against having such a language. But you might want to know what you're up against if, say, you decide to write an autocompleter for your editor that needs to know types.


If you're writing an autocompleter, you have the bigger problem that it has to function when the program isn't valid syntax, or is valid syntax but has just plain contradictory type information and doesn't typecheck. Non-terminating typechecking seems easy to handle in comparison.


In Lisp and in other languages you just ask the running process.


Because Turing-complete macros are super powerful. See Lisp.


A Turing-incomplete macro system will always terminate, but it may terminate in 2 years.


That's true in practice, but technically speaking it's false. You could imagine a contrived system which is not Turing complete but still contains a (rather useless) primitive that causes an infinite loop.


Not being Turing complete doesn't mean that you can't build algorithms of absurdly high complexity out of O(1) primitives.
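A hypothetical illustration in Python: this function is primitive recursive, so a totality checker would happily accept it, yet its running time grows as a tower of exponentials.

```python
def tower(n: int) -> int:
    # Primitive recursive: provably terminates for every n.
    # But tower(6) already has more digits than there are atoms in the
    # observable universe, so "terminates" may mean "not in 2 years".
    if n == 0:
        return 1
    return 2 ** tower(n - 1)

print(tower(3))  # 16
print(tower(4))  # 65536
```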


Did you mean to reply to a different comment? You seem to be restating the correct part of orthoxerox's comment. I was merely refuting the incorrect part.


Arguably that would still terminate with an error message.


I don't think a Turing-incomplete language implies that programs always terminate. As a counter-example, imagine an obviously Turing-incomplete language that contains only a single instruction which simply does nothing in an infinite loop.


This is great. I'm developing a static type system for my programming language (for board games), and I've messed up the type system so many times.

The challenge beyond the math is then making the error messages useful, and also being able to make progress beyond the first error in a useful way.


Reminder: strict type checking is more trouble than it’s worth for some of us.

Ask yourself why dynamically typed languages keep appearing, despite a fanatical resolve to purge them from the face of the earth. It’s almost as if some people prefer them.

But I guess the knowledge of salvation by compile time binding must be brought to the late binding heathens, to use a poor analogy.

On the other hand, I’m willing to admit that strict compile-time type checking is what many or most people prefer.


I generally agree, except if I don't. Whether or not that is the case, I'm entirely uncertain about!

Generalizing to transportation one might ask: What is the better transportation device, a boat or a horse?

... from which one can somewhat reasonably conclude that the real question is whether the question itself is the right one.

Few oppose types if they don't have to do anything at all and it doesn't take any of their time, whether at run time, at compile time, or otherwise. Realities being what they are, at least one of those conditions is always false for any type system that is actually useful.

Because of this, and for a few other similar characteristics of types and computation, it boils down to nothing but a question of which kind of tradeoffs people find palatable.

Given that premise, there should be no surprise that the answers are subjective no matter the responder. The question of better is here nothing but an entirely subjective question veiled as an ostensibly objective one.


One thing I don’t miss is C style Wild West access.

E.g., My coworker grabbed a word from somewhere, slapped a cast on it and proceeded to write into it. Hilarity ensues.

All the burden of explicit compile-time checking, but no guarantee that anything is what you said it is by the time your coworker uses it. Of course, no run time is wasted on silly things like type, bounds or null checking, either.


Wait, C#'s type system is Turing complete?? Does somebody have an example type that the compiler cannot check?


I too am wondering if anyone can show us a proof.


> There exist many type inference algorithms, the best known one is the so-called Algorithm W.

Is this correct? I dug out Milner's paper [1] where he states that Algorithm J is more efficient (which was what I had been led to believe), but that Algorithm W is more suited to functional (as opposed to imperative) implementations.

Edit: Actually I think I'm parsing this sentence incorrectly, "best known" means literally that as opposed to "most efficient known".

[1] https://homepages.inf.ed.ac.uk/wadler/papers/papers-we-love/...


Best known is indeed meant as "most widely known" (although that may be true for only my little bubble). I think it's taught in most type theory / formal methods courses though. If anybody has facts for/against this claim I'd be happy to update the page.


It might make sense to update to "most widely known" to avoid that source of ambiguity. You can take that sentence as ((best known) algorithm) or (best (known algorithm))


Yeah I think you're definitely right, and Algorithm W is how HM was introduced to me when I studied it.

kevincox's comment sums up my misunderstanding.


Ah, I recently started using type hints in Python. Would've loved to see how that compares even though it's not inherently a statically typed language!


Python's type system is an instance of "gradual typing" (like e.g. TypeScript). PEP 483 is actually a nice introduction to this: https://www.python.org/dev/peps/pep-0483/
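A tiny sketch of what PEP 483 means in practice: the hints are checked by external tools like mypy, while CPython itself ignores them entirely at run time.

```python
def double(x: int) -> int:
    return x * 2

# A static checker would flag this call, but the interpreter does not
# consult the annotations at all, so it runs without complaint.
print(double("ab"))  # "abab"
print(double(3))     # 6
```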


Right, it should still be possible to statically type it for the purpose of testing. Wouldn't really be close to practical usage but I'd still be interested in seeing how it compares.


The headline made me hope for an article about ergonomics.


People keep arguing for array covariance as if soundness and developer ergonomics were in conflict, but upgrading your type system from a laptop keyboard to some infamously-sound Cherry MX blues shows you indeed can have both.

With the amount of high-quality keyboard manufacturers these days, though, and the support for variance in preferred type parameters—do you need backlighting? Mac keys?—I do not blame people for asserting the problem of picking one is undecidable.


Ugh, I hate to do this, but GPT-3? Am I being paranoid?

Edit: Nevermind, I think I just missed the joke maybe


Heh, it accidentally ended up type punning.


I was expecting something about carpal tunnel syndrome or Dvorak keyboards as well.

Then I realized it was Haskell zealotry :-(


I wish


What a cool resource. I've been meaning to finish an MVP for a programming language I've been trying to write, and it's pretty encouraging how many languages use undecidable type systems.


Is the comment about Zig correct about its typing? It's certainly true about its compilation, but that's not the same thing. Also, if you're going to have a Turing-complete compile time, you might as well have it in something that looks like a programming language, since that will fit your debugging mental model.


Since you can pass types to compile-time functions, yes, that is correct. Take the example from the 0.6.0 documentation (https://ziglang.org/documentation/0.6.0/#comptime): at compile time the compiler has to figure out whether to type the `max` function using `u32` or `f64`. Since you can make the boolean condition arbitrarily complex, the _type_ of the max function called is undecidable in general.
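A rough Python analogue of that Zig example (names invented for illustration): the element type is picked by an arbitrary computation, so recovering it statically means evaluating arbitrary code.

```python
def comptime_condition() -> bool:
    # Stand-in for an arbitrarily complex compile-time computation;
    # deciding its result in general is as hard as running the program.
    return sum(range(10)) % 2 == 1  # 45 is odd, so True

# The element type of `max_` depends on that computation, much like
# Zig choosing between u32 and f64 at comptime.
ElemType = float if comptime_condition() else int

def max_(a, b):
    return ElemType(a) if a > b else ElemType(b)

print(max_(1, 2))  # 2.0, because the condition selected float
```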

I'm not arguing for or against undecidability in type systems; the page aims to present the information neutrally.


Sorry, I wasn't defensive about the decidability of the type system, just curious about the analysis, which squares with my understanding of zig now! Thanks for the clarification; it might be helpful to expand on that in the doc.


Great list!

Can anyone comment on Ada's type system?


What about Julia?


Julia has no (what some call "static", but that's actually the only form that matters) type system. It also has Lisp-like macros.


Having no type system (e.g. Forth) is different from having dynamic checking for your types (e.g. JavaScript, Python, Julia iirc), though.

Julia also has Common Lisp-like multimethods, which are pretty nice, especially for linear algebra sorts of stuff, where you might be multiplying a scalar by a tensor, or a tensor by a tensor, and don't want to have to specify different syntax for the two operations.
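Python's stdlib only offers single dispatch (functools.singledispatch), but Julia-style multimethods dispatching on both operand types can be sketched with a hand-rolled method table (`defmethod` and `mul` are invented names for this sketch):

```python
# Hand-rolled multiple dispatch: a method table keyed on the types of
# both operands, roughly like Julia multimethods.
_methods = {}

def defmethod(ta, tb):
    def register(fn):
        _methods[(ta, tb)] = fn
        return fn
    return register

def mul(a, b):
    # Look up the implementation for this exact pair of operand types.
    return _methods[(type(a), type(b))](a, b)

@defmethod(int, list)
def _(s, v):  # scalar * vector
    return [s * x for x in v]

@defmethod(list, list)
def _(u, v):  # elementwise vector * vector
    return [x * y for x, y in zip(u, v)]

print(mul(2, [1, 2, 3]))    # [2, 4, 6]
print(mul([1, 2], [3, 4]))  # [3, 8]
```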


I don’t see how JS or Python (not so sure about Julia) meaningfully have "dynamic checking for types". If I have a function 'def foo(a): return a.startswith("foo")' and I pass an int in, Python doesn’t insert something like 'if not isinstance(a, str): raise TypeError("...")', partly because Python doesn’t even know that a str is expected; only that the param has a method called startswith(). And fair enough, maybe the type isn’t str in a statically typed language, but a Go-like structurally typed interface with a startswith() method; however, the function also isn’t inserting any meaningful check until the last possible moment, when there is nothing left to do but blow up anyway. So I don’t see how this can meaningfully be called "dynamic type checking" when the types are all implicit and maximally abstract and the "check" is only dynamic dispatch. Pretty sure this all applies equally to JS, except that a failed dynamic dispatch doesn’t raise an AttributeError but "undefined is not a function", IIRC. Not sure about Julia's behavior, but I would guess that it also isn’t going to be sprinkling dynamic checks all over, given its aspirations toward performance.


> I don’t see how JS or Python (not so sure about Julia) meaningfully have “dynamic checking for types”.

What would you describe as dynamic typechecking then? Would Scheme be a language that would fall under your definition, since it has much less dynamic dispatch?

> isn’t inserting any meaningful check until the last possible moment when there is nothing left to do but blow up anyway

I mean, inserting it earlier would require static knowledge of types, no?

> Not sure about Julia behavior, but I would guess that it also isn’t going to be sprinkling dynamic checks all over given its aspirations toward performance.

Haven't used it since I took a linalg class in undergrad, but my understanding was that Julia does similar dynamic dispatch, but it's using a JIT, and does type propagation. As a result, once operand types are known, the compiler is free to monomorphize and inline.

A coworker wrote a thing to do this for Common Lisp, too (for some numerical code); if you're willing to give up a bit of dynamism, you can get quite a bit of performance out of this.


> What would you describe as dynamic typechecking then? Would Scheme be a language that would fall under your definition, since it has much less dynamic dispatch?

I’m not sure about Scheme because I’m not very familiar, but one possibility is simply that “dynamic type checking” is meaningless. In other words, one could argue that anything that blows up at runtime is “runtime type checked” but that doesn’t seem meaningful. The only meaningful definition of “runtime type checking” that I can think of is explicit comparisons between a type annotation and a value’s runtime type information, for example, given this Python:

    def foo(a: str) -> bool:
        return a.startswith("foo")
I would argue that this would be “type-checked at runtime” if the interpreter implicitly added this snippet at the start of the function:

    if not isinstance(a, str):
        raise TypeError("'a' is not a str")
There are other possible notions, including a program which type checks each of its functions as their definitions are evaluated by the interpreter—I would call this “just-in-time type checking” and consider it a type of runtime type checking.

But these are different IMO than just blowing up at the last possible moment because there is no reasonable way to continue the program.
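The isinstance() insertion described above can actually be packaged up; here is a sketch of a hypothetical runtime_checked decorator that derives the checks from the function's own annotations:

```python
import functools
import inspect
import typing

def runtime_checked(fn):
    # Insert the isinstance() checks described above, driven by the
    # function's own annotations (simple, non-generic types only).
    hints = typing.get_type_hints(fn)
    sig = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name!r} is not a {expected.__name__}")
        return fn(*args, **kwargs)

    return wrapper

@runtime_checked
def foo(a: str) -> bool:
    return a.startswith("foo")

print(foo("foobar"))  # True
try:
    foo(42)  # now rejected *before* the body runs
except TypeError as err:
    print(err)
```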

> I mean, inserting it earlier would require static knowledge of types, no?

First of all, even if it did require knowledge of static types, it could still be checked at runtime and thus runtime type checked. Secondly, many types aren’t known statically—consider types that are programmatically generated, such as are common when using SQLAlchemy, boto, etc. The interpreter could still type check these at runtime via either the JIT type-checking method described above or the flat-footed insertion of an isinstance() check at the start of the function (but I would argue that the later the type checking happens the less useful and meaningful it becomes).

> Haven't used it since I took a linalg class in undergrad, but my understanding was that Julia does similar dynamic dispatch, but it's using a JIT, and does type propagation. As a result, once operand types are known, the compiler is free to monomorphize and inline.

This sounds like the JIT type checking method that I described above, which is a concrete example of a meaningful runtime type checking example (as opposed to the notion that Python/JS does type checking at runtime which is either meaningless or incorrect IMO).


I agree with everything you say, except I have a lower bar of "meaningful".

> one possibility is simply that “dynamic type checking” is meaningless.

If I take your quotation out of context, this was the assumption of some people who separated languages into simply "strongly" vs "weakly" typed, by which they meant "checked by compiler" vs "not checked". It's incorrect though, and is the reason for separate "static" vs "dynamic" distinction.

C is a famous example of static (checked at compile time) but weak (≈ unsound) typing: the compiler will happily accept programs that corrupt memory in all kinds of ways, mutating a "const" variable, freeing memory that's still in use, crashing, remote code execution, and more... I don't think there are any invariants that a C compiler can enforce.

Scheme, Python, Lua, JavaScript etc., OTOH, have no static (= compile-time) type checking, yet they maintain some invariants! An object that has pointers to it will not be freed; an object's type is known and will not change (well, some class transmutation is allowed but some is not; an int will not turn into an array); some types are immutable; you can never divide a string by a string! etc...

Moreover, by maintaining run-time metadata about object's types, they can tell you specifically that an operation raised a TypeError. Thus these languages are strongly typed at run time.

IOW, I'm arguing that anything that blows up at run time is checked IFF it tells you it was a type error. This is meaningful compared to C segfaults, which tell you nothing :-)
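The contrast is easy to observe from Python itself: the failing operation is rejected with a TypeError that names the types involved, rather than silently reinterpreting bits.

```python
# Python keeps run-time type metadata, so an ill-typed operation fails
# immediately and tells you it was a *type* error, unlike a C segfault.
try:
    "a" / "b"
except TypeError as err:
    print(err)  # the message names the operand types involved

x = 7
assert type(x) is int  # an object's type is known and stable
```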

---

The Java paper https://dl.acm.org/doi/pdf/10.1145/2983990.2984004 shows an interesting subtlety. "Fortunately, parametric polymorphism was not integrated into the Java Virtual Machine (JVM), so these examples do not demonstrate any unsoundness of the JVM" — yet they present unsound programs that "type-checks according to the Java Language Specification and is compiled by javac, version 1.8.0_25".

What gives? If a bad program compiles, how come JVM is still sound? See, Java has two type systems!

- JVM is the runtime, intended to be capable of loading even untrusted bytecode yet still maintain some type invariants. It manages memory and type metadata, and for this bad program it will correctly identify the type violation at run time: "When executed, a ClassCastException is thrown inside the main method with the message 'java.lang.Integer cannot be cast to java.lang.String'".

- The compiler uses a distinct, more complex static type system. The question of soundness is: can any program that passed the compiler cause JVM run-time type errors?

  + The Java language allows you to write type casts. These are deliberate "trust me" holes in the *static* type system, whose specified semantics is: the JVM will check the type at run time and raise an exception if it is not what the programmer promised.
    As https://typing-is-hard.ch/#what-about-unsafe-casts says, since these are deliberate, and fall back to meaningful run-time type checking(!), let's ignore them and redefine "soundness" as: can a program with no explicit casts cause a run-time type error?

  + Generics only exist in static type system!
    They are invisible to JVM (aka "type erasure"), they compile into dynamically checked casts.  
    Their soundness goal was that the compiler can prove these implicit casts will never fail — e.g. you can only put candies into ArrayList<Candy>, so arr.get(0).eat() is guaranteed to give you a candy you can eat.
    That paper demonstrates a simple 17-line program with no explicit casts that compiles yet causes run time type error.
---

If you think about what type soundness means, ALL statically typed languages have 2 type systems! There are execution semantics: what it means to, say, compute number + number. And there is a static language for talking about types that aims to predict / prove the types that will be involved at run time. Static checking fails if they don't match. Dynamic checking largely fails when you don't do it :-) But also when you have erased the info you needed to do it.


What you call "dynamic checking your types" is actually the same as having one big algebraic "unitype" and inspecting how each value was constructed.

A type system is a means to prove the absence of certain errors in a program.

Think of a type system as a parking sensor that sounds an alarm before you hit an obstacle. What Python does is the crashing sound when you hit the obstacle. Forth is finding the dent days later and wondering where the eff that came from. Python is more conclusive than Forth, but you still have a dent.


After hearing "but TypeScript is unsound!" a billion times... kinda glad to see that being unsound is the "state of the art".

I do wonder how far you can go without letting go of unsoundness in practice, though. Convenience is nice; guarantees are very, very nice.


The state of the art is Idris, which has a sound, decidable, and extremely powerful type system. Most state of the art mainstream languages like Rust have sound but undecidable type systems.

Many older languages with ad-hoc type systems are unsound, but there are also some older relatively used languages with powerful and sound type systems like Haskell.


I don't see how this article supports the idea that "unsoundness" is state of the art?


It may depend on age, overall knowledge of CS (including its history), and level of experience with multiple programming languages. Maybe the article itself does not literally support the idea, but it might still be inferred from it. It might also be the resulting discussion and responses, or additional field experience, that give rise to this conclusion.

I personally think the whole typing movement of the last decade has been mostly a red herring. What I mean by that is that it appears to try to solve the "wrong" problem. While strong typing can certainly eliminate a whole class of bugs, my impression is that the situations where it has been demanded most always turned out to be ones where horrible overall design and code quality were the actual main problems (not typing).

Coming from a (embedded) assembly and C direction myself, I have nothing against strong typing. But what has consistently rubbed me the wrong way about this modern strong typing "movement", both intuitively and rationally, is how it appears to fix something that is essentially broken on a whole different level. No amount of strong typing is going to fix that.

I know, nothing much actionable or concrete here. Just my opinion. Regardless, this old geezer certainly agrees with the overall impression that "unsound" is pretty much the state of this art, these days.


Static typing takes the burden of verifying certain forms of program correctness off the programmer and puts it in the compiler where it belongs, thus expanding the space of correct code the same programmer can write per unit cognitive load. The more kinds of things you check for in the type system (static lifetimes in Rust, for instance), the more benefits you get. It doesn't serve as a substitute for good taste, because while a mediocre programmer's reach is extended, a great programmer's reach is even further extended. Mediocre programmers can thus meaningfully contribute to embedded and kernel-level code under the tutelage of a good programmer without risking blowing the whole thing up, and increase their skill.


I get your point(s) and I guess you mean well. Statically typed languages do essentially what you said: they mostly (if not only) move data validation from run time to compile time. In dynamically typed languages this often doesn't happen in the first place, while with static typing you are pretty much forced to do it.

I have a little over 30 years of programming experience, the majority of it in statically typed languages (and quite a few). I have absolutely nothing against them (on the contrary).

I still remember the rise in popularity of dynamically typed languages, and how/why they were promoted. Faster development and code simplification, by only doing data validation where it was really needed (at run time), among them. Whether that was a good development depends on where you stand. Do you care about correct software? Are you a business making software for profit, minimizing development costs? Are you part of a constantly growing market, where programmers who properly understand CS fundamentals are increasingly harder to find (or just more expensive)?

Either way, to me it's no surprise that dynamically typed languages became as popular as they did, nor would I say that it is totally without merit (depending on your perspective). Sadly, it's also a heck of a lot easier to make utter garbage with dynamically typed languages. Not least because a programmer is no longer confronted with their intellectual excrement at compile time. Unless all edge cases are properly tested during dev time, those usually become "user problems" at run time.

People like me have frequently warned consultancy clients about the dangers of technologies and languages based on dynamically typed data, but more often than not, other factors were deemed more important. Fair enough, at least from a business perspective. Maybe not so much from an ethical and/or legal liability standpoint, but somehow the software industry has managed to create for itself a rather unique immunity from legal consequences of horrendous software quality (so good luck selling the need for better software).

Sure, the contemporary static typing movement (as I like to call it) does appear to have genuine intentions to assist programmers (some of whom should probably not even be allowed to code) in writing substantially less buggy code. But I certainly do not agree with the idea that static typing will extend both mediocre and great programmers. In fact, if any of the contemporary statically typed languages (or worse: ad-hoc extensions to dynamically typed languages) makes somebody write substantially better software, then I firmly believe that this person either has fundamental CS-related problems, or is being rushed/pressured too much to ever produce anything of considerable quality. I sincerely doubt that a statically typed language (or language add-on) will ever fix either of those.

Such programmers will nonetheless still be good enough for a lot of regular programming gigs. But, for the love of the gods, please keep those people miles away from anything embedded or kernel-level.

Again, I think you mean nothing but good. You may even know far more about the subject than I can tell, just based on your response. Either way, I certainly don't mean to offend you. Still, whenever I read a response like yours, I can't escape the feeling that it sounds more like a repeated mantra than that it comes from a deep understanding of the actual problem, including its long/complicated history.


> Faster development and code simplification, by only doing data validation where it was really needed (at run time), among them.

This is a trap many programmers fall into. You should be parsing, not validating[0]. If you are ingesting JSON data, for instance, what you need is not a generic JSON parser but a parser that only accepts JSON representations that map to a particular, static data type and throws parse errors where in a dynamic language you might find "validation errors", for example, an unknown data field or a field value of the wrong type.
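As a sketch of the idea (in Python, with an illustrative `User` type and `parse_user` function - not any particular library's API), the boundary parser might look like:

```python
# "Parse, don't validate": turn raw JSON into a static type exactly once,
# at the boundary. Bad input fails loudly as a parse error, and the rest
# of the program only ever sees a well-formed User.
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class User:
    name: str
    age: int

def parse_user(raw: str) -> User:
    data = json.loads(raw)
    extra = set(data) - {"name", "age"}
    if extra:
        raise ValueError(f"unknown fields: {extra}")
    if not isinstance(data.get("name"), str) or not isinstance(data.get("age"), int):
        raise ValueError("missing field or wrong field type")
    return User(name=data["name"], age=data["age"])

user = parse_user('{"name": "Ada", "age": 36}')
# parse_user('{"name": "Ada", "agee": 36}') would raise ValueError
```

In a statically typed language the compiler then guarantees that nothing downstream of the parser can forget a field or mix up its type.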

Getting the types right early doesn't substantially slow you down overall, and yields a result that's much, much easier to get correct. Any such slowdown is a greater "capex" in the form of initial development, that's more than offset by much, much lower "opex" in the form of maintenance.

> In fact, if any of contemporary statically typed languages (or worse: ad-hoc extensions to dynamically typed languages) make somebody write substantially better software, then I firmly believe that this person either has fundamental CS related problems, or is being rushed/pressured too much to ever produce anything of considerable quality. I sincerely doubt that a statically typed language (or language add-on) will ever fix either of those.

This is wrong, and it's wrong for "theory vs. reality" reasons. Take one of the languages in your background: C. What you see may be true of a hypothetical programmer who knows all the arcane rules to C and can avoid tripping one of C's many, many UB landmines the first time, every time. But most programmers in the real world are human beings who may not have had their coffee. And they are going to make mistakes and unwittingly trip UB landmines. And this is as true of great C programmers as it is of beginners. You could build some strong rules around C, such as "allocate memory as little as humanly possible" and "avoid threading unless you can justify needing the performance gains" that will help programmers stay well clear of the traps inherent to C programming. Or you can switch to Rust, and fearlessly write code that allocates memory and even threaded code, because Rust's type system guarantees memory safety and a lack of data races.

[0] https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...


> This is a trap many programmers fall into. You should be parsing, not validating.

Maybe it was an unfortunate choice of words on my part. Additionally, I only meant to indicate how it was promoted/advertised, rather than make any statement about whether it was a sound argument. Either way, at the end of the day, a parser is a validator all the same.

Sure, more (domain-)specific parsers indeed make better code (duh). That's true for both compile-time and run-time parsing. But how often, in real-life business situations, is there enough time to create a solid architecture and fully fleshed-out (domain-)specific (run-time, or compile-time) parsers?

Again, I'm not disputing that statically typed languages may have a potential for creating more structurally healthy code. However, from my experience it still takes a hell of a lot of time and effort (for which there often is no room in real life business situations). Dynamically typed languages often create seriously flawed software, no doubt about that. But I'm convinced that this is more because many programmers just don't do what they actually should be doing in any case (in any language).

Statically typed languages can alleviate that problem a little, by forcing programmers to think more about what they should already be thinking about in the first place. But it is not a magic bullet. I'm not even going to discuss the downsides of overly specified/specific software code.

> This is wrong, and it's wrong for "theory vs. reality" reasons.

In case you missed it: my main gripe with all of this is that I just cannot find any of all these great improvements, that evangelists keep pushing, in actual real-world examples (at least not within my personal experience). Yet here you try to convince me that my point is too much theory, telling me a story about how real-world programming actually works. Seriously? I mentioned C to indicate which direction in the industry I come from, nothing more. Well, maybe also to indicate that I've been doing this for some time now.

No need to tell me how programmers work (plenty of experience with that), or to make excuses about how it's just human nature for them to fuck up if they aren't held by the hand. I might have bought that kind of bullshit 25 years ago. There have been plenty of attempts purporting to give programmers better tools and assist them in writing better software. Most of them with only good intentions. Nothing new about that; it's about as old as commercial software itself. However, I've seen a rather depressing trend of how all of those hardly ever held up in reality, over time. Whether that is because the knowledge level of programmers constantly degrades, or because such solutions promote laziness, that might be more of a philosophical discussion and hard to establish as fact.

I find it at least a bit ironic though, how I apparently am being schooled on how I "just don't get it right", by what to me sounds and smells a hell of a lot like theoretical evangelical drivel itself. In fact, at this point I can't help but recall an iconic argument about how plants crave Brawndo, because of the electrolytes.

Keep on programming the way you do. If you are a good programmer, it shouldn't matter which language or tools you use. I wish you nothing but the best and good luck. Time will tell if Rust will survive the hype and actually deliver what it has been promising for a while. Maybe it indeed will. I'm not holding my breath though. Especially not since (nontransparent) corporate influence/control over languages and tools is becoming an ever growing factor (with all kinds of problems associated with it).


It would be interesting to know the practical implications - given how many of these languages are very popular I'm assuming it doesn't really matter?


what's the connection between being undecidable and Turing complete?


If your type system is Turing complete, then it's undecidable. This is easily seen through Rice's theorem [0], which states that all non-trivial (not always true or always false) semantic properties (i.e., about what inputs a program maps to what outputs) of programs in a Turing-complete language are undecidable. In particular, what type something is going to be is undecidable.

Intuitively, you should probably think of this as a slight extension of the halting problem: you can't always tell whether something (your type checking) halts. It's not that much of a leap to say that you also can't tell whether it will halt after an even or an odd number of steps, if it does halt (this doesn't immediately follow from the halting problem, but it is true). Imagine a type that said "if some given computation takes an even number of steps to halt, I'm 'type A'; if it takes an odd number, I'm 'type B'". That's something a Turing-complete type system can express, and since you can't tell whether the computation takes an even or odd number of steps, you can't tell whether the variable is 'type A' or 'type B'.

If it's undecidable, it's not necessarily Turing complete, but it's very easy to be accidentally Turing complete, so it probably is.

[0] https://en.wikipedia.org/wiki/Rice%27s_theorem


Type checking is undecidable for type theories that are Turing complete. A type theory is Turing complete if it can simulate a Turing machine. Roughly, type checking for a type theory is undecidable if it's not decidable whether every term has the type assigned to it by the programmer. The halting problem is the connection, which the website states; but the author is inviting bug reports, so maybe it's a good idea to ask them to clarify it.


tl;dr: Every Turing-complete system is undecidable - i.e., for every Turing-complete system which answers "yes" or "no", there is no general procedure to determine in finite time, for any given input, whether the system will accept it.

Long version:

Being Turing Complete means that a system can compute everything that a Turing Machine can compute.

For a long time, it was an open question whether there was a mechanical way to decide, for a given logical statement, whether it was universally valid (the Entscheidungsproblem, or "decision problem").

Alan Turing's creation of the Turing Machine was originally an attempt to answer this question. The idea was that to decide whether a statement was universally valid, one would need to be able to mechanically derive the statement from the axioms of logic - i.e., there would be an algorithm (which Turing formalized as a "Turing Machine") that could take the statement as input and decide whether it was universally valid.

Turing then proved - likely his greatest claim to fame - that such an algorithm could not exist - as one can always build an input for which it will be forced to give the wrong answer.

Hence, Turing showed that for any Turing Machine - or any system equivalent to a Turing Machine - which answers "yes" or "no" to some input, there is no general way to decide in finite time whether it will accept its input.
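The diagonal argument can be sketched in a few lines of Python (everything here is illustrative; `halts` stands for the hypothetical oracle Turing proved cannot exist):

```python
# Sketch of Turing's diagonal argument. `halts` is a hypothetical
# oracle: halts(f) should return True iff calling f() terminates.
def make_paradox(halts):
    def paradox():
        if halts(paradox):
            while True:   # oracle said "halts" -> loop forever
                pass
        # oracle said "loops forever" -> halt immediately
    return paradox

# Every candidate oracle is wrong about the program built from it:
def always_yes(f): return True
def always_no(f):  return False

p_yes = make_paradox(always_yes)  # oracle says it halts, but it would loop
p_no  = make_paradox(always_no)   # oracle says it loops, but it halts
p_no()                            # returns immediately, refuting always_no
```

Any total `halts` you plug in gets refuted the same way, which is why no such algorithm can exist.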


Plz add dart to your list :)


If you have resources for dart I'd be happy to include it :)



