Why don't more languages offer flow typing? (ayazhafiz.com)
192 points by fourteenminutes on April 4, 2022 | hide | past | favorite | 122 comments



This is not a complete answer, but covers some languages. If you program too exclusively in dynamically-typed languages, you can get too used to not thinking about how physically large your types are, because you work in a world where everything is boxed, allocations are so plentiful that you hardly have a way of thinking about them because your language does them at the drop of a hat, and so on.

But there are many strongly-typed languages that look at types at compile time and decide how large they are, by which I mean, how many bytes of physical RAM they take. These languages have gotten better and better over time at the APIs they offer that give the benefits of this without incurring the programmer costs, but under the hood, no matter how slick they've gotten, there is still a mapping of type -> size & layout. In these languages, this sort of narrowing is a great deal more complicated, because it implies a new type, which will probably require more allocation. For instance, in the example given between a "string" and a "number", that definitely looks like two differently-sized types to me. This would become very complicated very quickly as you start having to allocate these things in ways that can't be transparent to the user, pass these types around, etc. etc.

One would expect that the affordances of statically-typed languages would end up being very different. And in fact, they are. Many modern statically-typed languages can solve the same problems outlined in the post, they just solve them in a different way. In this case, with an Either or, more generally, a sum type. Then "getViewsOf" returns a value of that sum type and you can switch on it based on the value. It isn't exactly the same, in fact I'm not sure there's one aspect of it that's the same let alone the whole, but it solves the same problems.

So my suggestion would be that there are in fact a lot of languages that essentially have the same feature. They just don't call it "flow typing" and it's not spelled the same way.
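For instance, a rough Rust sketch of the sum-type version (get_views_of and its payloads are made up here, standing in for the article's getViewsOf):

```rust
// Hypothetical analogue of the article's getViewsOf: instead of a
// union type narrowed by flow analysis, return a tagged sum type.
enum Views {
    Count(u64),       // the "number" case
    Message(String),  // the "string" case
}

fn get_views_of(post: &str) -> Views {
    if post.is_empty() {
        Views::Message("no such post".to_string())
    } else {
        Views::Count(42)
    }
}

// The match plays the role flow typing plays in TypeScript:
// each arm sees the payload at its concrete type.
fn render(v: Views) -> String {
    match v {
        Views::Count(n) => format!("{} views", n),
        Views::Message(m) => m,
    }
}
```

The enum has a single known size and layout (the larger variant plus a tag), so none of the allocation questions above arise.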


Dynamic and strong typing are not opposed (dynamic and static typing are opposed, and, to the extent the distinction is valid, weak and strong typing are opposed), dynamic doesn't mean everything is physically boxed (most dynamic language implementations don't box some subset of small primitive values, usually including bools and small ints), and, in any case, flow typing is a feature of static type systems (though some of those offering it are optional static typing systems for dynamic languages.)

But you kind of get at the real answer toward the end despite talking around it most of your post: Sum types plus pattern matching solves the same set of problems union types with flow typing solves, and a bunch that the latter doesn't (like composability.)
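A quick Rust sketch of the composability point: sum types nest without collapsing, whereas a union like T | null instantiated at T = U | null cannot distinguish "no entry" from "entry whose value is null". (The lookup function is invented for illustration.)

```rust
// A lookup that can fail, over values that can themselves be absent.
// With nested sum types the two kinds of "nothing" stay distinct.
fn lookup(present: bool, value_set: bool) -> Option<Option<i32>> {
    if !present {
        None             // no entry at all
    } else if !value_set {
        Some(None)       // entry exists, but its value is absent
    } else {
        Some(Some(7))
    }
}
```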


Remarkably, a post written in generalities speaks in generalities. None of those things may be "necessary", but there sure is a lot of all the things I said, aren't there?

What you call "speaking around" I call an important thing to understand about a lot of languages: At some point, your data will be physically laid out in memory. If you don't care about performance... and I mean that as a perfectly valid choice in many situations, not a snark... it doesn't much matter how it is. But if you do, and you selected a language based on that, it matters a lot, and you have to view every type system detail through the lens of "what does it look like in memory?" if you want to understand it. The choices this class of language makes for their type systems will never fully make sense if you are not paying attention to this, and also if you gloss over the legitimate difficulties that arise for any sort of "why don't they just...?" sorts of questions.

(In particular, you really don't understand how good Rust is until you look at it through this lens, then look at just how much simultaneous power and convenience they've drawn out while never compromising on the memory issues. It's an incredible piece of work. It probably isn't "optimal" because such things don't exist in our real universe, but it probably is as close as we'll ever see for that class of language.)


> dynamic and static typing are opposed

Any given typing judgement we might want to make about a particular term can be checked statically or not, and it can be checked dynamically or not.

If the language has terms (... that might be evaluated) that don't get checked statically or dynamically, then it is going to be weakly typed (with respect to those typing judgements we're considering).

If we've already checked something statically, we "know" that it will always hold, and checking it dynamically is in some sense redundant (provided we actually believe our checking to be correct and our environment to be sufficiently robust that things don't change out from under us) and I think that's a part of the reason the two feel opposed, but in principle you could check a property both statically and dynamically for a particular term.

Step back a little to the language level and they aren't at all opposed; you might want to check some properties statically and some dynamically (eg. static checking of class/interface vs dynamic nullability in Java) or check a given property statically for some terms and dynamically for others (eg. gradual typing).


Something that's checked dynamically is not a "typing judgment", by definition. It's a proposition established at runtime, possibly even with non-trivial data attached describing "how" the dynamic check succeeds, and possibly affecting program operation in its own right as that proposition object gets "passed" to downstream functions that depend on that dynamic check. These are two altogether different things.


Sure. Tip: if something's true by definition, it's usually not interesting. Substitute the appropriate phrase - "observation of a property which, in the static context, might constitute a typing judgement"? Note that we're here specifically considering things that can be typed statically, as outside of that setting there is no question of whether static and dynamic are opposed.

The rest of your comment seems concerned with the fact that the tagging infrastructure typically needed to check these properties at runtime can also be used for other purposes. That's true, but I don't see the relevance.


That probably reads a little snarkier than appropriate. My apologies for the tone.


I think Swift combines sum types with flow typing in an interesting way: if you use a guard statement, it propagates that information into future uses of the type. If I remember correctly, Kotlin also does something similar.

Pattern matching and sum types help, but they solve different problems.


Isn't this sort of an orthogonal problem? Flow typing implies that your type is in some way unknown at the time the code fragment is evaluated. I think this can mostly happen in two situations:

1) The type is generic: A function may be called with a different type on each call site - but for each particular call site, the type is known at compile time.

2) The type is polymorphic - i.e. the full type is not known at compile time at all.

When the first case occurs, a compiler could either go the C++ way and generate different, non-generic variants of the function - or go the Java way and just pretend the type is polymorphic.

For the "C++" option, I think flow typing would fit very well without introducing additional complexity: For each variant the compiler generates, it knows which concrete type is used, so it can evaluate the condition at compile-time and just drop the branches that don't apply for a particular variant of the function.
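Roughly, in Rust terms (using TypeId as a stand-in for whatever compile-time type test the language would offer): in each monomorphized copy of the function, the condition is a constant, so the dead branch can be dropped.

```rust
use std::any::TypeId;

// After monomorphization, the TypeId comparison below is a compile-time
// constant for each concrete T, so one branch of the `if` is dead code
// in every generated copy and can be eliminated.
fn describe<T: 'static>() -> &'static str {
    if TypeId::of::<T>() == TypeId::of::<String>() {
        "a string"
    } else {
        "something else"
    }
}
```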

For polymorphic types, on the other hand, the additional complexity is already priced in: You need some kind of mechanism to represent values of different structure anyway (pointers, unions, whatever) - so flow typing wouldn't be much more than some syntactic sugar to save you from writing casts you'd have to do anyway.


Another very common case in TypeScript (and other languages) is union types. Other languages have constructs for checking the underlying type of a union type directly, but TypeScript is smart enough to figure it out based on the properties of the constituent types and what you've checked so far in your code.

e.g.

    interface A { type: 'a' }
    interface B { type: 'b' }
    type C = A | B;

    const c: C = ...;
    if (c.type === 'a') { /* c is of type A here */ }


That's basically case 1, right? The compiler knows which type is being used at each call site, so it can generate a separate function for each type in the union and eliminate the type check/dead branches.

I guess the exception would be if you have a non-homogenous array that you try to map over. In that case there's probably no way around boxing the values.


For consts and function arguments yes, I think. But you could have some mutable variable or field whose value depends on runtime state. In that case, you could have an actual polymorphic type.


But the previous example isn't that case.

    if (c.type === 'a') { /* c is of type A here */ }

This is dynamic typing: the code is checked at runtime, and the result of that check is leveraged statically.


Yes, that's correct. In the GP example, c was const, so the type can be determined at compile time.


I was writing TypeScript, where const just means the variable is not reassigned. For instance this is valid:

    const c: C = Math.random() < 0.5 ? { type: 'a' } : { type: 'b' };


Ah, I'm sorry. You're right of course!


It actually happens all the time, which is why the suggestion is valid (namely that flow typing is useful). Flow typing is deeply related to union types, at least for expression-based languages.

Take rust. You have a function that on one path of a branch returns an int, in the other, a float. Rust says that's an error. I say your function returns a union type :)

And this is totally pervasive. A function which returns an int in an if branch (w/out an else) and elsewhere returns nothing actually returns an optional.
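In Rust terms, that implicit-optional reading would amount to spelling the Option out by hand (a sketch; rustc itself rejects the if-without-else form as an expression):

```rust
// Rust rejects `if cond { 1 }` as an expression of type i32 without an
// else branch; the union-type reading would give it type Option<i32>.
// Spelled explicitly:
fn maybe_int(cond: bool) -> Option<i32> {
    if cond { Some(1) } else { None }
}
```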

A variable which is assigned in one branch to a reference to an int (an l-value in languages that screwed this up, like rust; a ref int in languages that, sensibly, use types to encode this property, like algol68), and in the other branch to an int value, is technically a union type. If that union is added to a float - that shouldn't be an error! The int should be widened, the ref int should be dereferenced, then widened.

OTOH, if you try to assign an int to that union - now that is an error, because values can't be assigned to. You can only assign to types which represent a location in memory, like ref int, unlike int.

In the above discussion, the effect of control flow on the types is critical. Languages like rust ignore this, and to my mind, their ergonomics suffer considerably because of this. C++ side-steps this through statement based syntax, which makes variant types clunky.

Union types are beautiful and powerful, but there seems to be a lack of appreciation for the subtle and deep ways they impact language design. You have to design them in right from the beginning; it's a tricky thing to get right, and doing so absolutely requires flow typing.


Sum types are often used in cases where the type is unknown at run time. In C++ the analog would be std::variant. In Rust it would be enums. In Haskell it's algebraic data types.

A first class sum type in c++ ala Rust or Haskell would certainly be appreciated, as the clunkiness of std::variant/std::visit is a well known annoyance.


Could you expand on your first sentence? I don't understand what you mean by sum types being used when the type is unknown at runtime.


They meant unknown at compile-time, not run-time.


> In these languages, this sort of narrowing is a great deal more complicated, because it implies a new type, which will probably require more allocation.

Flow typing need not imply any kind of "narrowing", though. It could simply endow a control-flow branch with a statically-typed proof object, asserting whatever post-conditions were established in the branch. The "narrowing" could then become a separately-coded step, taking that proof object as part of its input arguments.
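A napkin sketch of that idea in Rust (names invented for illustration; a real design would also tie the proof to the particular slice, which this doesn't):

```rust
// A token witnessing "this index was checked against this slice".
// It can only be produced by `check`, so any function that demands one
// knows the bounds check happened on some control-flow path upstream.
struct InBounds(usize);

fn check(slice: &[i32], i: usize) -> Option<InBounds> {
    if i < slice.len() { Some(InBounds(i)) } else { None }
}

// The "narrowing" is a separately-coded step that consumes the proof.
fn get_checked(slice: &[i32], proof: InBounds) -> i32 {
    slice[proof.0]
}
```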


That sounds great at the napkin-sketch level.

But if you mean what I think you mean, now your entire compiler suite needs to be reworked to change its internal concept of what guarantees a "type" has. Previously the compiler was able to assume it was a contiguous chunk of memory, but now it needs the concept of a non-contiguous chunk of memory representing a certain type, and it needs everything everywhere updated to accommodate that, and it may need to add code to make copies for things that didn't used to need to make copies if you create an array of this new type, etc. It goes on and on.

I'm not the world's expert on all this under-the-hood stuff, but there is still something to be said for learning a bit of C or something deep enough to understand how it interacts with the machine, and C probably is still the best language for this despite its manifold other flaws. There's all kinds of fun games you can play in conceptual math space to make awesome programming languages, but if you want your language to play in the "goes fast" space you will find only some of those games map to anything efficient on real machines. (There is certainly a space for languages that just provide the nice abstractions, too; between "C performance" and "Python performance" there's a lot of room to play!)


Flow typing is purely a static analysis process. The representation of the type doesn't change, just the range of possible values in a scope constrained by some runtime condition.

Some things that might have triggered warnings no longer need to, because if the runtime conditional evaluated to true there's no way the warning could take effect.

Consider this little bit of C code:

  void foo(int x, unsigned y) {
    if (x > y) {
      printf("but it's bigger!?");
    }
  }
Due to the idiosyncrasies of C, if you call foo(-1, 1) the comparison will promote x to unsigned and conclude that x is bigger than y. There's a warning in most C compilers for it, because it's the sort of invisible bug that'll wreck your code.

What's annoying, though, is if you do this:

  void foo(int x, unsigned y) {
    if (x >= 0 && x > y) {
      printf("but it's bigger!?");
    }
  }
And still get the warning. The compiler should be able to infer that the test x >= 0 means there's no fault path any more. Clang 13, at least, still warns.

A C compiler using flow typing would not issue a warning for the second implementation of foo. Neither x nor y change representations. The execution path through your compiler for producing byte code doesn't change. A few of your tree structures might pick up an extra annotation for the provable facts the warning generator can use to improve its output, that's all.


Pretty much. Flow typing like TypeScript does it is only really sensible when your run-time supports dynamic typing but your compiler checks types statically. In other words only a language similar to TypeScript could have a type system similar to TypeScript's.


Way before 1.0 (IIRC), Rust in fact used to have structural records, and wanted to try to only have them. But due to the challenges described by you and the article, they went back to regular nominal ADTs.

I do think that was the right call at that time.


In the end, I feel like structural typing is one of those attractive nuisances. It sounds good on the outset, and it does provide a limited amount of convenience, but then it turns out not to be that convenient, and it removes a lot of safety and flexibility by taking intent out of the thing.


I think the majority of the convenience of structural typing can be had with explicit casting between structurally equivalent types. It eases pressure on the type checker (checking that two types are structurally equivalent is pretty quick, checking that any expression satisfies something structurally is a little harder) and gets you ultimately what a programmer wants - convenience.


Go's interfaces are a decent 80/20 answer. You pay for them, you opt in, and they sit on top of the base type system; they aren't the primitive the entire type system is based on and they aren't what the compiler works with at the most basic level.

I have on a number of occasions wished I could define an interface that could be fulfilled by a field without me having to write a method, but it's not a use case that comes up enough to bend a type system design around. (And there are enough other workarounds, like embedding a custom type and such, that it's fine.) I think a lot of people underestimate just how large every last detail of every design decision looms at this level of a language. Too many languages do something slightly convenient for a relatively small set of flashy use cases without fully accounting for or realizing the costs it imposes.

(This is a pretty hot take but IMHO Python is almost imploding itself constantly adding features that match that description.)


> Go's interfaces are a decent 80/20 answer. You pay for them, you opt in, and they sit on top of the base type system, they aren't the primitive the entire type system is based on

That's interesting because I see them as the exact opposite. You're not saving or gaining anything of significance, and it's just increasing confusion.

Penny wise, pound foolish, if you will.


I'm not sure what you mean by "not gaining anything of significance." Go without interfaces (and for simplicity let's ignore 1.18's generics for a moment) would not be a useful language at any significant scale. You'd end up with a lot of "by hand" interfaces of structs full of function pointers (or method closures), only without the compiler support.

Nor am I sure exactly what "confusion" is being increased.


> I'm not sure what you mean by "not gaining anything of significance." Go without interfaces (and for simplicity let's ignore 1.18's generics for a moment) would not be a useful language at any significant scale.

Why would go not have interfaces? Where did you make that up from?

Go would have nominative interfaces, like most other languages.


There is a fundamental tradeoff between cost-free upcasts and efficient layout, but languages could offer both options and many in between.

IMO in general: in the face of tradeoffs, moving the goal posts to let the programmer decide is an unexplored 3rd option.


> There is a fundamental tradeoff between cost-free upcasts and efficient layout, but languages could offer both options and many in between.

Not sure what the relationship to my comment is there, this tradeoff exists in nominative land, it’s got nothing to do with structural typing.


That depends on where it's used. For example, Rust's trait bounds are structurally typed, for example making a function that accepts a type that's Summary + Display. This "Summary + Display" type doesn't need to have a name, and in that case it's great. On the other hand sometimes you want to have two strings, one that's a name and the other the address, and not substitute one with the other. I don't think there is "a" solution, only multiple solutions and tradeoffs.


> For example, Rust's trait bounds are structurally typed, for example making a function that accepts a type that's Summary + Display. This "Summary + Display" type doesn't need to have a name, and in that case it's great.

That's still nominative. The bound is not named, but it's based on names, not on structure. Otherwise you couldn't have marker types in such a bound, or would always have all of them and be unable to opt out.

> I don't think there is "a" solution, only multiple solutions and tradeoffs.

I have a hard time seeing that: if you have a structure, you can always give it a name. But you can also give different names to the same structure. So a nominative type system is more flexible and powerful at the very limited cost of having to name things. And even then, nominative type systems can support anonymous types (where the name is simply implicit / automatically generated) e.g. lambdas in C++ or Rust.


I think that's structural. My understanding is the bound is based on structure, as there is no way to differentiate between a Summary + Display in one function and a Summary + Display in another. On the other hand, with Rust's enums (which are nominal), you can have enum A { Toto, Tata } and enum B { Toto, Tata }, and both will be different. A function accepting A will not accept B. A function accepting T : Summary + Display will accept any type that implements both traits.

From my understanding, nominal vs structural typing is about how you consider group of types. For structural typing, types that contain the same thing are the same. For nominal, that isn't the case.
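Sketching the contrast in Rust (with Display standing in for the Summary + Display bound above):

```rust
use std::fmt::Display;

// Nominal: A and B have identical structure but are distinct types.
#[allow(dead_code)]
enum A { Toto, Tata }
#[allow(dead_code)]
enum B { Toto, Tata }

// Will not accept a B, even though B has the same shape as A.
fn takes_a(_: A) {}

// A trait bound accepts any type satisfying the named traits,
// regardless of what the type itself is called.
fn show<T: Display>(x: T) -> String {
    format!("{}", x)
}
```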


I entirely agree with you! And all of these are things covered in the post.


Yeah, I hadn't heard the term "flow typing" before but I thought the same thing, in other languages I've used you'd use a Result type and a switch to do what the author was doing.


It sounds like you are very confused about types and values. Static type checking does not incur any extra allocations at runtime.


The answer is simple: because other languages don't need it - they have different features to deal with it. The author even mentions it: pattern matching.

Just that he picks a language that doesn't support union-types. But that doesn't mean that flow typing would be necessary here - it means that the language(s) should support union-types and extend their pattern matching accordingly.

In fact, I would say that flow typing is almost like a workaround for missing pattern matching.


> In fact, I would say that flow typing is almost like a workaround for missing pattern matching.

If flow typing is 'type narrowing' (the article kinda dragged on pulling in unrelated concepts so it lost me), then if anything, it is at least as good as pattern-matching/switch-blocks. At least in Typescript. That is because it works in switch statements and non-switch statements and provides the same guarantees. I don't get this argument.

Consider something like:

    const data: {pages: number} | null = f()
    if (!data){
       return 0
    }
    return data.pages + 5

Sure you can switch on the 'data' as well (exhaustive type-switches are a thing in Typescript too), but thats a syntactic preference. It surely is nice to get the same guarantees without _needing_ a switch statement.


> I don't get this argument.

Maybe it's because you focus on typescript here. But typescript is in a position that other languages are not: it has to make things work within an existing ecosystem, in particular with an untyped language. Under these constraints, flow typing might be the best solution. But that doesn't mean it's generally "at least as good as pattern-matching/switch-blocks".

Btw, pattern matching and switch-blocks are very different things - maybe it would be easier to understand if you dive into a language that has native pattern matching and no switch-blocks at all, because from what you write it seems to me that you have not been exposed to pattern matching much yet.


> It surely is nice to get the same guarantees without _needing_ a switch statement.

Is it?

    let data: Option<usize> = f();
    match data {
        Some(pages) => pages + 5,
        None => 0
    }
Seems pretty nice.

Or if you want a more structurally similar code with a guard (and removing the useless bits):

    let Some(pages) = f() else {
        return 0
    };
    pages + 5


> Seems pretty nice.

Never said it's not. I'm just saying it's nice to have type narrowing everywhere, since it includes block-based switching/matching as well (what you're arguing is nice, and I'd agree).

You'll have to use your imagination here because in typescript narrowing works on all conditionals (ternaries, boolean switches, etc), so naturally there are many more examples where it's useful. It's easy to trivialize any specific example.

E.g. they're useful in JSX when you want to do a one-line type-safe boolean switches like: {node && <RenderNode node={node}/>}.


I don't think pattern matching and union-types makes narrow typing useless. Rust has both pattern matching and union types but still implements some specific forms of control-flow based typing.

For example, consider this Rust snippet :

    pub fn positive1(x: isize) -> Option<usize> {
        if x > 0 { Some(x as usize) } else { None }
    }
Unless I'm mistaken, without narrow typing, this cast would be impossible, and there would be no way to avoid the redundant runtime check caused by try_into().unwrap(), that is not even optimized unless you trick the compiler.


> Unless I'm mistaken, without narrow typing, this cast would be impossible

same-sized integer casts in Rust are no-ops [0]; the conditional isn't type narrowing, it just avoids the cases where the cast would not preserve the same semantic value.

[0] https://doc.rust-lang.org/reference/expressions/operator-exp...


Right. Thank you for the correction.


You need flow typing when you do pattern matching on a generalized algebraic data type (GADT). Here is an example in Java

    sealed interface Foo<T> permits Impl {}  // union type
    record Impl() implements Foo<String> {}  // product type

    <T> T m(Foo<T> foo) {
      return switch(foo) {
        case Impl impl -> "hello";  // flow typing T=String 
      };
    }


By that definition, many (maybe most) languages then already support flow typing.


that's not the same as what is defined in the article, and it is supported by plenty of languages


Yeah, the same code would look much clearer in a language with union types. Heck even Swift cribbed them.

    switch resp {
      case Result(val, meta):
        doStuff(val)
      case Error(msg, code):
        logger.main.debug(msg)
    }


You mean sum types, not union types (TypeScript has union types, your example uses sum types.)


A union type is the C attempt to implement something like sum types, which other languages extend a bit to literally implement sum types.

Those two words live in different contexts, as unions are about a memory usage pattern and sum types are about conceptual software design, but they often go together in the same language feature.


> A union type is the C attempt to implement something like sum types

Sum types, unions (in the C sense), and union types are three distinct concepts.

Wikipedia's article confusingly merges the latter two concepts together, but they're different things. TypeScript[1] and Scala[2] have union types, but they have nothing like C's notion of "unsafely reuse the same bits in memory to be interpreted as different things".

[1]: https://www.typescriptlang.org/docs/handbook/2/everyday-type...

[2]: https://dotty.epfl.ch/docs/reference/new-types/union-types.h...


I did mean sum types. Thank you for the correction.


Nobody's mentioned dependent typing yet, but languages like Idris and Agda will "narrow types based on control flow" (in fact, since they're pure-functional, control-flow is the same as data-flow). For example, zipping two vectors of the same length:

    zip: Vec n t1 -> Vec n t2 -> Vec n (t1, t2)
    zip Nil Nil = Nil
    zip (Cons x xs) (Cons y ys) = Cons (x, y) (zip xs ys)
Here the 'Vec' type has two constructors, Nil and Cons. Since their types specify the same length, the type-checker knows we can ignore the 'zip Nil Cons' or 'zip Cons Nil' cases.


This is amazing, but I don't understand the justification:

> Since their types specify the same length, the type-checker knows we can ignore the 'zip Nil Cons' or 'zip Cons Nil' cases.

Specifically, I have trouble understanding "their types specify the same length". Whose types?


The types of the vectors: the length is a generic parameter to the type. So a Vec 4 Int and a Vec 5 Int are different types.


Oh right, the output is parametrized by the input. I suppose this is what dependent types are.

That's really awesome. Thanks for the explanation.


> Oh right, the output is parametrized by the input. I suppose this is what dependent types are.

Dependent typing is really about the ability to parameterise types over values, and to reason about those value-dependent types.


You might read the Vector type constructor 'Vector N T' as "this is a Vector of T's of length N".


Contrary to what many comments here state, I don’t think this is only useful in a dynamic runtime kind-of environment. I often think I could use some form of this in Rust and/or Haskell. In a very specific way:

I often want a single constructor/branch of an enum (sum type) to be a type as well, specifically a sub-type of the full enum. So once I learn which branch a value is, I can treat it like that and even pass it around first class – without unpacking.

I think this should be implementable in Rust without any runtime overhead as pure syntactic sugar. You get the same effect by creating a new struct for every set of values per branch and using those wrappers instead.
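A rough Rust sketch of that manual encoding (names invented): one struct per variant payload, with the enum just wrapping them, so a value matched once can keep traveling at the narrowed type.

```rust
// One struct per variant payload...
struct Circle { radius: f64 }
struct Square { side: f64 }

// ...and the enum wraps them, so no runtime cost over a plain enum.
enum Shape {
    Circle(Circle),
    Square(Square),
}

// After matching once, a Circle travels as a first-class value
// without any re-matching downstream.
fn circle_area(c: &Circle) -> f64 {
    std::f64::consts::PI * c.radius * c.radius
}

fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle(c) => circle_area(c),
        Shape::Square(sq) => sq.side * sq.side,
    }
}
```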


You can do this in Haskell with GADTs.

    data Up
    data Down

    data Foo tag a where
        Up :: a -> Foo Up a
        Down :: String -> Int -> Foo Down a
When you enable all the necessary extensions for this to compile, it gives you exactly what you've asked for. If you leave the tag variable polymorphic in a function argument, you can receive values of either constructor. If you specify Foo Up a, you can only receive values with the Up constructor.

It might be a bit more verbose than you want, with needing to declare types to use as the type level tags. (I looked into using PolyKinds and DataKinds to use the constructor as its own type argument, but GHC doesn't allow that type of recursion.) But it does do exactly what you've requested - it allows you to treat each constructor as a separate type in some contexts, and allow any of them in other contexts.


Definitely yes!

It's a loooong time ago, but the pattern exhaustiveness checker of GHC wasn't up to this last time I tried. But my guess is things are much better now and this might actually work.


There have been big improvements in the exhaustiveness checker. I know that it's not fully decidable with GADTs, so it will always have some holes, but it can handle this case now.


Yeah this would definitely be useful in Rust. It isn't in Rust currently because enum variants aren't distinct types, but that's definitely on the to-do list. I suspect Rust will get this eventually.


You want something like Haskell's Typeable class?


It's not really called out as such, but the C# 8 feature Nullable Reference Types is an example of deeply embedding flow typing into a language. The nullability state of a reference type can change based on many things you can do in code. It's really neat and extremely intricate.


It's exactly that.

e.g. inside "if (x != null) { }", or after "x = y ?? throw new ArgumentNullException()" then x is known not null.

C# has flow types, but only in this specific and limited way. You could say that about a lot of programming language features.


This is called Type Narrowing by the way. Control flow analysis is only one way this works in TypeScript.

Most languages don’t have the type system necessary to make this work in a sensible way.


In Kotlin this is called smart casts. It works great.

Doing a null check means that after that you can treat the variable as non nullable. Doing a type check, narrows down the type to what you just checked. Or going down a switch statement branch on the type actually implicitly does the cast as well.

It's both strongly typed and convenient.
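The same checks translate fairly directly to TypeScript; a minimal sketch (the function and its name are just illustrative):

```typescript
function describe(x: string | number | null): string {
  if (x === null) return "null";   // null check removes null from the type
  if (typeof x === "string") {
    return x.toUpperCase();        // type check narrows to string
  }
  return x.toFixed(1);             // only number is left at this point
}
```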

Kotlin even goes a step further and introduced contracts a few versions ago that you can bind to functions. For example calling isNullOrBlank() on a nullable string changes to the type to nullable if the answer is false. You can write your own contracts for your own functions even. This is what the source code for this function looks like:

  @kotlin.internal.InlineOnly
  public inline fun CharSequence?.isNullOrEmpty(): Boolean {
    contract {
        returns(false) implies (this@isNullOrEmpty != null)
    }

    return this == null || this.length == 0
  }
I guess typescript does something similarish.


Yeah, they're called "type predicates" (or "type guards"). So you can do something like

    function isFish(pet: Fish | Bird): pet is Fish {
      return (pet as Fish).swim !== undefined;
    }
and Typescript is smart enough to let you use it in an if-branch

    // Both calls to 'swim' and 'fly' are now okay.
    let pet = getSmallPet();
 
    if (isFish(pet)) {
      pet.swim(); 
    } else {
      pet.fly();
    }
Typescript also has typechecker support for assertion functions. So like,

    function assertIsDefined<T>(val: T): asserts val is NonNullable<T> {
      if (val === undefined || val === null) {
        throw new AssertionError(`Expected 'val' to be defined, but received ${val}`);
      }
    }
Which we can use like so:

    function doSomething(pet: Fish | undefined) {
      assertIsDefined(pet);

      // safe to use without checking for null/undefined,
      // because our assertion function narrowed the type for 
      // usage later in the function 
      pet.swim();
    }


>For example calling isNullOrBlank() on a nullable string changes to the type to nullable if the answer is false

For anyone confused like I was, that's a typo, it changes the type to not nullable. Likewise, isNullOrBlank() is an extension of the listed isNullOrEmpty().

https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/is-...

https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/is-...


See also “Occurrence Typing” in Typed Racket. [^1]

> One of Typed Racket’s distinguishing type system features is occurrence typing, which allows the type system to ascribe more precise types based on whether a predicate check succeeds or fails.

One of the big advantages is that it lets a text editor use the added information to supply better auto-complete suggestions. It also makes it possible to eliminate some safety checks at runtime because the type checker has ensured that certain calls will be safe based on prior conditional checks.

[^1]: https://docs.racket-lang.org/ts-guide/occurrence-typing.html


Dart does this, or at least something very similar, and it is really nice. It's most useful for narrowing nullable types to their non-nullable counterparts after a null-check, but the subclass case works too.


As your sibling already stated, in Kotlin it is known as smart casting. Good to see that the article already recognizes the prior art from ML, which was conceived in 1971.


Dart’s “..” operator, which allows void-returned function calls to be chained, adds no performance cost to the code and is one of my favorite pieces of Dart.


Doesn't "flow typing" seem more like a bandage for the if-statement in languages that have nothing better?

There is a language construct which is purpose built for unpacking sum types: The case expression (sometimes called match expression)

  case resp of
    Views n -> ...
    Error e -> ...
Inside each branch, we have access to a value of the more specific type.
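For comparison, the closest TypeScript analogue is a discriminated union switched on its tag; a sketch loosely modeled on the article's Views/Error example (the `Resp` shape and names here are invented):

```typescript
type Resp =
  | { kind: "views"; count: number }
  | { kind: "error"; message: string };

function render(resp: Resp): string {
  switch (resp.kind) {
    case "views":
      return `views: ${resp.count}`;   // narrowed to the views variant
    case "error":
      return `error: ${resp.message}`; // narrowed to the error variant
  }
}
```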


Why should type narrowing be limited to just case, and not both case and if? I'd argue that both of these are examples of flow typing.


My point was that if your language has a dedicated facility for case analysis, you don't really need to use if statements for that purpose. You just use case. That's one answer to "why don't more languages have flow typing".


Yes and no.

Consider predicate types, or first-class null safety through unions.

If-statements introduce propositions, which type systems could take advantage of.

example:

  let idx = randomInt();
  if (0 <= idx && idx < arr.length) { // proposition introduced
    // idx now has type: int where 0 <= idx && idx < arr.length
    // irrefutable proof of safe indexing by type propositions
    let value = arr[idx];
  }


> Even in the presence of such tests, Java demands that the author cast their objects appropriately.

This is one of the many reasons why I've found Kotlin so refreshing after five years of writing enterprise Java

https://kotlinlang.org/docs/typecasts.html#smart-casts


I looove flow typing / type narrowing. HTDP taught how to think through it (data type narrowing / checking for and handling different types) while building functions. TypeScript lets me enforce the same process, with IntelliSense warning me whenever I get off track. It's been a joy to see how quickly and solidly the code stacks up.


I'm trying to design a programming language that can be compiled into C99 and it's not at all clear to me how you represent these things statically.

The idea from a programming point of view is really nice and in a dynamic programming language environment everything is an object and you can pass whatever through any function. This doesn't work all that well if you want to translate your program into a statically typed environment, in a cost effective way.

You need some way to reason about the type information that hopefully doesn't create memory allocations all over the place.

This can be done by introducing tagged data types but it can also lead to a combinatorial nightmare where you need to generate a lot of extra code.

In general, I don't think all this complexity is warranted in a statically typed environment, and depending on what you are doing it doesn't end up being that important. Even then, it's mostly just that: a nice-to-have.


Typically you wouldn't change the compiled code too much, it changes what is valid to type check and inserts casts as necessary. In general, flow typing just makes more programs type check.


You have to emit the plumbing for something like flow typing to work. You can't omit the type checking, because you cannot account for all the possible call sites at compile time; that would lead to a combinatorial explosion. You do end up generating code to pass arguments with runtime type information.

Take the example of a variant type, which is what I'm most interested in.

You have something like:

  type X = int32 | string | boolean | SomeCustomType

When you then write a function that accepts X as a type, you need to be able to pass any of these things into that function. To do that you need an auto-generated user-defined type that can carry the information; in C that might look like this:

  struct X {
    union { int32_t a; char* b; bool c; } data;
    Tag type;
  };

then your type assertions can work on this. If you wrote this in my fancy language:

  if x != nil {
    // x cannot be nil here
    x.foo
  }

the equivalent C code for this would be something like

  void f(X x) {
    if (x.type != WELL_KNOWN_NIL) {
      // we know it's safe to access foo (assuming we typed X as Foo | nil)
      x.data.foo;
    }
  }

Trying to solve this at compile time without any runtime checks might be possible, but I don't know that it would be better; it might be too slow if I have to solve some nasty combinatorial problems. This is at least somewhat easy to explain.
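To make the shape concrete, here is roughly the same tagged-value idea sketched in TypeScript instead of C (the `Tag`, `Boxed`, and `show` names are invented for illustration):

```typescript
type Tag = "int32" | "string" | "nil";

interface Boxed {
  tag: Tag;                      // the Tag field from the C struct
  data: number | string | null;  // stands in for the C union
}

// Roughly what the compiler would emit for: if x != nil { ... }
function show(x: Boxed): string {
  if (x.tag !== "nil") {
    return `has value: ${x.data}`; // the tag check makes the access safe
  }
  return "nil";
}
```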


So I've been working on a personal project which is in a mix of C++ and TypeScript -- partially because of practicality and partially to learn TypeScript, and I'm really liking the language.

One question I have is whether there's any possibility that TypeScript could, in the long run, gain performance advantages over pure JS? That the compiler could leave behind some type information artifacts so that V8 (or similar) could use that information at runtime to optimize method dispatch and other operations?

Likewise with the flow pieces mentioned here, I gotta think that the VM could optimize out certain runtime checks on dispatch or conditional branching if it knows that the compiler has already checked for these?


I really enjoy TypeScript, but every time I use it I think to myself, this would be so much better if it were built on top of something other than JavaScript. Maybe AssemblyScript or some other TS-to-WASM target will come along with a better underlying type system and a good standard library. Dare to dream.


Yep. Except that the base semantics of the language really are tuned for JavaScript. So I can't see a separation being feasible between the two.

It would have to be a new language, sharing maybe 90% of syntax etc. but differing in some places where interoperability with JS has forced compromises (base number types is one thing I can think of. I'd like separate int and floats, etc.) and, yeah, targeting WASM etc. And honestly, I'd be game for that. Esp if it came with performance advantages.


What you're describing is essentially the origin story for Dart.


Yes that's what I meant by proper type system - int, float, decimal... proper date/time/timezone support. To name a few :)


Not quite what you are dreaming for, but there's https://typescripttolua.github.io/


V8 already does this.

When you write your code in a way that is akin to what you would do in a static environment, V8 emits additional type checks; if these pass, it will run your code through an optimized version of the code that makes certain assumptions about the data types in use. If these type checks fail, then V8 will deoptimize the function/code.

Deoptimization needs to happen because the assumptions made about the execution of the code were wrong and the optimized code cannot handle this special edge case. V8 will then revert to less optimal code for the specific case, but this code is more general and can handle the special case that occurred.

V8 can and will toggle between optimized and unoptimized versions of your code now and then but it has limits. If it cannot settle on a version of your function that is optimized it will stop trying to optimize the code because the cost of doing so is significant.

When you write your JavaScript as if it was more statically typed than it actually is, you do enjoy optimization benefits from V8.


Unfortunately a big issue in TypeScript is that objects can and surprisingly often do have the completely wrong type. Some libraries even blatantly violate their TypeScript types. So any optimizations which take advantage of TypeScript types are likely to lead to very hard to debug behavior.

A better option is to make/use a language which interfaces well with JavaScript but is not JavaScript and actually checks its types. If the type system is actually sound, and any code coming in from JavaScript is type-checked at runtime, then compiler optimizations are reasonable.

As for your last point, I know that JIT-compilers like V8 can and do infer types at runtime (e.g. by noticing if a particular variable or function arg is always set to the same type) and will compile specialized function overloads which take advantage of the inferred type for optimizations (e.g. using fixed-width integers even though all JavaScript numbers are doubles).


FYI, Java 16 did add pattern matching for instanceof, so you can avoid the cast.


I was going to point this out, too.

I thought it also supported cases like:

  if (foo instanceof Bar) {
    // foo is typed as Bar in this block
  }

Is that true, or did I get that wrong?


  if (o instanceof String s) {
    var foo = s.indexOf("bar");
  }
Java 17 introduced a preview of the same feature for switch statements

  return switch (o) {
     case Integer i -> String.format("int %d", i);
     case Long l    -> String.format("long %d", l);
     case Double d  -> String.format("double %f", d);
     case String s  -> String.format("String %s", s);
     default        -> o.toString();
  };
Future versions will have improved support: http://openjdk.java.net/jeps/420


I use the `o instanceof Class c` thing all the time in my hobby compiler on Java 16 right now. The Java 17 version will greatly simplify much of my codebase.


Kotlin does this, it's great.


This question actually has a simple answer:

Most languages have embraced statements over expressions for a lot of language constructs.

This causes a lot of issues:

An if-statement is in essence a function from boolean to unit/() (i.e. from a barely useful type to a type with no useful information), while an if-expression will contain enough type information to at least provide this kind of “flow typing” (even if it induces an amount of boolean blindness).

It’s even worse when you get to loop-statements, which often is a function from unit/() to (unit/() | bottom/⊥), compared to a map or a reduce or fold, which again provide enough type information to provide this kind of “flow typing”.
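To make the contrast concrete in TypeScript: expression forms like conditionals and folds carry types that their statement counterparts discard (a sketch, with illustrative names):

```typescript
const nums = [1, -2, 3];

// A fold has type number, unlike a for-loop statement,
// which produces no value at all:
const sum = nums.reduce((acc, n) => acc + n, 0);

// A conditional expression has a useful type,
// unlike an if-statement:
const label = sum >= 0 ? "nonneg" : "neg";
```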

Sometimes you will of course need to provide impure conditionals or loops, but that can be made explicit in the type system.

IMHO statements in programming languages are an anti-pattern.


Almost all of the languages I know that do "flow typing" (TypeScript, Flow, Hack), "smart casts" (Kotlin), or "type promotion" (Dart) are fairly statement-oriented.

Expression-based languages tend to have pattern matching which provides another way to solve the same problem.

Flow typing is most useful in imperative languages where code like this is common:

    if (foo is! Bar) return "not a Bar";
    foo.someBarMethod();
Early returns and other imperative control flow is idiomatic and it's annoying if the static type system doesn't understand it.

In a more functional or expression-oriented language, early returns and other imperative control flow like that is rarer, so there's less "flowing" to type over.
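A TypeScript rendering of that idiom, with the type test swapped for a `typeof` guard (the function and its names are illustrative):

```typescript
function handle(foo: string | number): string {
  if (typeof foo !== "string") return "not a string";
  // After the early return, foo is narrowed to string
  return foo.toUpperCase();
}
```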


How does pattern matching solve the same problem?


It gives you a very nice notation for checking if a value has some type and, if so, binding a new variable that refers to it as that more specific type.


How does that solve the problem of statements being devoid of return information?

Also, what? Pattern matching differentiates between values of a single type, and while the assignment mechanism is a great nice-to-have, it still completely follows the basic type consistency that actually typed functions provide.

Statements have none of these qualities.


I'm sorry, but we seem to be talking past each other.

Why is it a problem that statements are "devoid of return information"? A statement occurs in a position where, by definition, no value it produces will be used.


> Most languages have embraced statements over expressions for a lot of language constructs.

Some of the most used languages with flow typing have statements and follow them for typing, so that is emphatically not the reason.


My view might of course be a bit outdated, do you have any examples?

(Specifically for this point: “and follow them for typing”)


The example in TFA. TypeScript.


So my point stands…?


What’s the difference with type inference? Is it that flow typing can happen at runtime for just-in-time / non-compiled languages?


Flow typing is basically a kind of type inference where the derived type of a variable is allowed to be different in different places of the code, based on control flow. However, not all type inference is flow typing.

Flow typing is definitely a thing also in compiled languages, one example is Crystal, which is both compiled and has flow typing.


Unless you build a compiler around it, it's probably quite hard to implement.


... isn't this just OOP inheritance/subtyping/interface implementation with if/then checks for the more specific types?


I work on a language, Dart, which also relies heavily on flow typing (which it calls "type promotion" because apparently every language needs its own name for it). We use it both for subtype tests and null checks. It's really nice and a large net win for Dart, especially given its history.

However, if I had a time machine and could redesign Dart from scratch, I would be tempted to avoid flow typing and instead do something more like Rust and Swift do: Variables keep their static type but have pattern matching and nice syntactic sugar for simple use cases of it to make it easier to bind new variables with the narrowed type.

The main problem is that flow typing is very complex, subtle, and can fail in ways that users find surprising. For example:

    foo(Object obj) {
      closure() {
        obj = "not int any more.";
      }

      if (obj is int) {
        print(obj.abs());
      }

      closure();
    }
This function is technically safe, but it's very hard for static analysis to reliably prove what kinds of flow analysis are valid when closures come into play. If that closure can escape, it can be impossible to prove that it won't be called before the variable is used.

A simpler, more annoying example is:

    class C {
      Object obj;

      foo() {
        if (obj is int) {
          bar();
          print(obj.abs());
        }
      }

      bar() {}
    }
This code looks like it should be fine. But if C is an unsealed class and some subclass overrides `bar()` to assign to `obj`, then the promotion could fail. Because of this, Dart can't promote fields and it causes no end of user annoyance. Top-level variables and static fields have similar limitations.
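A common workaround, in Dart as well as TypeScript, is to copy the field into a local variable, which flow analysis can trust because nothing else can reassign it. A TypeScript sketch of the pattern, mirroring the example above (the class and its members are illustrative):

```typescript
class C {
  obj: unknown = 42;

  foo(): number {
    const o = this.obj;     // local copy: no subclass override can reassign it
    if (typeof o === "number") {
      this.bar();
      return Math.abs(o);   // o stays narrowed to number across the call
    }
    return -1;
  }

  bar() {}
}
```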

Even when it works, it can be surprising:

    foo(Object obj) {
      if (obj is int) {
        var elements = [obj];
      }
    }
Should `elements` be inferred as a `List<Object>` or `List<int>`? What about here:

    foo(Object obj) {
      if (obj is int) {
        var elements = [obj];
        elements.add("not int");
      }
    }
Should that `add()` call be OK or an error?

Flow analysis is cool and feels like magic. It does the right thing 90+% of the time, but there's a lot going on under the hood that pops up in weird surprising ways sometimes.

It's probably the right thing to do if your language has already invested in an imperative style. But if you have the luxury of defining a language from scratch, I think you can get something simpler and more predictable if you make it easier to define new variables of the refined type instead of mutating the type of an existing variable.

Dart is heavily imperative, so I think flow analysis makes sense for it. Accommodating an imperative style is one of the main things that makes Dart so easy for new users to pick up, and that's an invaluable property. But I admit I envy Swift and Rust at times.


We don't hold language authors accountable, and accept mediocrity.


Is this something like GADT's?


Feels like type refinement via pattern matching on GADTs is more expressive than this


Hafiz writes:

> almost every enterprise Java codebase I've had to work with has observed this pattern of testing whether an object is an instance of particular subclass, and using that to drive further computation. Even in the presence of such tests, Java demands that the author cast their objects appropriately. Of course you can always introduce an intermediate variable after the test, but I argue that this is still too much—upon verifying the instance of an object, the type system should be smart enough to update the object's type appropriately!

Because Java considers null to inhabit every type, in a Java project a few years ago, I handled this in a dynamic_cast-like way, as follows (abbreviated):

    public abstract class Security {     // ...
        public Stock asStock()   { return null; }
        public Future asFuture() { return null; }
    }

    public class Stock extends Security { // ...
        public Stock asStock() { return this; }
    }

    public class Future extends Security {
        public final SecurityTable factory;
        public final String exchange, symbol; // ...
        public Future asFuture() { return this; }
    }
This allows you to write relatively uncluttered type-dispatching downcasting code like this:

            public boolean contains(Security sec) {
                Future f = sec.asFuture();

                return f != null
                       && SecurityTable.this == f.factory
                       && f.symbol.equals(symbol);
            }
Of course this violates the open-closed principle (the abstract base class should be closed for modification) and official OO doctrine is that if you want different behavior for different subclasses you should put that behavior into a method that gets overridden by the subclasses, not in a conditional that attempts a downcast, so we did that a lot more often. But I found it a pleasant solution to the problem in the context of this Java project.

Of course it doesn't help in languages without implicit nullability, like TypeScript (and ideally that would be all statically-typed languages).

— ⁂ —

The big difficulty with flow typing is, as I see it, not that it clashes with nominal typing; it's that it incorporates your compiler's static control flow analysis capabilities into your language's type system. Consider Hafiz's Java example:

        if (node instanceof DomNode.Element) {
            Layout layout = ((DomNode.Element) node).layout;
            return new RenderNode.Styled(layout, /* ... */);
        }
        return new RenderNode.Noop(/* ... */);
It is entirely reasonable to request that the type system handle this and allow you to write:

        if (node instanceof DomNode.Element) {
            Layout layout = node.layout;
            return new RenderNode.Styled(layout, /* ... */);
        }
        return new RenderNode.Noop(/* ... */);
But how about this? Now we're no longer inside the static extent of the `if`, and we're depending on the compiler to recognize the unconditional early return:

        if (!(node instanceof DomNode.Element)) {
            return new RenderNode.Noop(/* ... */);
        }
        Layout layout = node.layout;
        return new RenderNode.Styled(layout, /* ... */);
How about constant folding and partial evaluation?

        if (1 == 0 || node instanceof DomNode.Element) {
            Layout layout = node.layout;
            return new RenderNode.Styled(layout, /* ... */);
        }
        return new RenderNode.Noop(/* ... */);
Do we want to skip type checking entirely for dead code? Straightforward flow typing gives us that the type of every variable inside unreachable code is void, the uninhabited type, so any operation whatsoever on it is type-valid. Do we want the compiler to accept code like this?

        if (1 == 0) {
            Layout layout = node.layout + node / node;
            return new RenderNode.Styled(layout, /* ... */);
        }
        return new RenderNode.Noop(/* ... */);
And of course doing control-flow analysis precisely isn't feasible due to the halting problem; you need to do some conservative approximation.

So, if you want your programs to be portable from one version of the compiler to the next, somewhere you need to write down precisely what conservative approximation you're using for control-flow analysis, and in particular what you're not.
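For what it's worth, TypeScript's answer to the dead-code question is to narrow exhausted or contradictory branches to `never`, the uninhabited type, which is assignable to everything. A small sketch (the function is illustrative):

```typescript
function classify(x: string | number): string {
  if (typeof x === "string") return "string";
  if (typeof x === "number") return "number";
  // The checks above are exhaustive, so x has type never here;
  // never is assignable to string (or anything else).
  return x;
}
```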


This example is why Java added sealed interfaces; otherwise you're left with ∞ - 1 possibilities.

  if (!(node instanceof DomNode.Element)) {
    return new RenderNode.Noop(/* ... */);
  }
  Layout layout = node.layout;
  return new RenderNode.Styled(layout, /* ... */);


It's true that we return a `RenderNode.Noop` in ∞ - 1 cases, but that's okay because we're not making any demands of `node` there.


I forgot about flow typing. Is that still an option on iOS?


You're thinking of swype, or whatever iOS calls it. Yes, it is built in to iOS.

This 'flow typing' is a programming topic, nothing to do with mobile keyboards.



