Hacker News new | past | comments | ask | show | jobs | submit login

Union and intersection types make the following assumptions:

(0) There is a single physical universe that contains all values.

(1) The role of types is to “carve out” fragments of this universe.

(2) Say you have two values `x` and `y` of types `A` and `B`. Because it is possible to form the union type `A | B`, to which both `x` and `y` belong, the only way to make sure `x` and `y` cannot be conflated with each other is to make them physically distinct elements of the universe.

On the other hand, abstract types make the following assumptions:

(0) Each type is a “world” of its own, defined in terms of its logical structure. The values of a type are a mere consequence of this logical structure. Hence, it doesn't make sense to try to “split” or “merge” types.

(1) It doesn't make sense to argue that an int is different from a string, because it's just as nonsensical as arguing that they are equal. What they are is incommensurable.

(2) Say you have type two different abstract types `A` and `B`. Because they could share an underlying representation type `T`, you cannot rely on values of different types having physically distinct representations.

---

Suppose we had a language with both union and abstract types. The contradictions between the assumptions behind union and abstract types would lead to hilarious consequences:

    signature ERROR_CODE =
    sig
      type error_code
      
      (* specification of operations on error codes *)
    end
    
    (* we use opaque ascription, so that the internal representation
     * of error codes becomes inaccessible to external parties *)
    structure ErrorCode :> ERROR_CODE =
    struct
      type error_code = int  (* prosaic representation *)
      
      (* implementation of operations on error codes *)
    end
    
    type union = error_code | int    (* union *)
    
    (* suppose we also have flow typing so that this example works *)
    fun test (u : union) =
      if u is error_code then 
        ErrorCode.printErrMsg u
      else
        printInt u
    
    val _ = test 42
You'd expect this code to print the integer `42`, but it actually prints:

    ERR_DOUGLAS_ADAMS : The answer to the Ultimate Question of Life,
      the Universe and Everything is the number forty-two.
Wasn't the fact error codes are internally represented as ints supposed to be unknowable to external parties?



Ah, I thought you meant that my construction would behave differently from a sum type if abstract types were involved, but it appears that you are actually worried that having non-disjoint unions available breaks abstract types by revealing their implementation details.

I agree that being able to change the internals of a module without breaking external code is a good thing, but unlike other security guarantees, abstract types can break merely by having a stronger type system which is able to prove strictly more things (such as whether two types are equal). That makes implementing completely leak-free abstractions quite difficult.

It can even be hard to tell which language feature exactly causes the leak. In your example, it is not just the existence of the union type that causes the problem, but the operation "is T", which you probably assume to be of type "forall a. (a | T) -> Bool" or something. Obviously, that operation is doing all the work of looking at the underlying implementation and leaking it.

While you couldn't meaningfully type such an operation without union types, having unions in your type system does not mean that such a function will be available. Although working with unions without being able to tell the types apart would be quite inconvenient, you could nonetheless remove that ability from the language, either completely (effectively requiring all unions to be disjoint) or by having special markers to enable or disable it (similar to Haskell's type class constraints).

Flow doesn't go to such great lengths because it is still built on JavaScript objects, which don't really have a concept of privacy. If there were private symbols (i.e. unique properties which can only be accessed by the module which defines them), you could trivially implement abstract types similar to Haskell's newtype wrapper as { [some private symbol]: implementation-type }. Until such a thing is added, you'll have to limit yourself to disjoint unions to avoid abstraction leaks.


> abstract types can break merely by having a stronger type system which is able to prove strictly more things (such as whether two types are equal)

Indeed, which is why I'm not such a huge fan of overly powerful type systems.

> That makes implementing completely leak-free abstractions quite difficult.

It means the type system has to have the right amount of expressive power. Too little, and separation of concerns becomes difficult. Too much, and separation of concerns becomes difficult too!

It also means that we can't rely on types to prove everything we would like to prove about programs. This seems intuitively right, because type systems are first and foremost meant to be mechanically checkable, not to be flexible enough to express every possible software requirement.

> In your example, it is not just the existence of the union type that causes the problem, but the operation "is T", which you probably assume to be of type "forall a. (a | T) -> Bool" or something.

This is correct. But removing this `is T` operation would greatly reduce the appeal of union types in practice.

> you could trivially implement abstract types similar to Haskell's newtype wrapper as { [some private symbol]: implementation-type }.

Both Haskell's `newtype` and your proposal are poor replacements for actual abstract types.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: