Hacker News new | past | comments | ask | show | jobs | submit login
Scala’s Types of Types (ktoso.github.io)
120 points by ddispaltro on March 15, 2016 | hide | past | favorite | 49 comments



> A major difference from Java in Scala is, that container types are not-variant by default!

Only arrays are variant "by default" in java (which was a huge mistake). Variance for generic collections is specified at use, not at the type definition level.

> This means that if you have a container defined as Box[A], and then use it with a Fruit in place of the type parameter A, you will not be able to insert an Apple (which IS-A Fruit) into it.

No. What this really means is that you can't assign an instance of Box[Apple] to a val/var of type Box[Fruit].


The last part makes sense, no? If you could do

List<Fruit> fruits = apples

Then you could add an orange to the fruits list, and it would appear among the apples.

It doesn't seem like you can fix this without immutability?


The problem with what Java calls a `List` is that it really isn't a list. It's a mutable cell, whose contents at any given point in time is a list. The methods of the mutable cell let you replace the list with a new one, which happens to be partially based on the old one. Also, because Java's data structures are usually ephemeral, the new list can be created in the same place where the old list was.

A type constructor of lists (which Java's standard library doesn't have to the best of my knowledge) is covariant on its element type, but a type constructor of mutable cells is invariant on its content type.


If you really did have a case where your list is a list of apples, wouldn't you type:

List<Apple> apples = aps

If you're using a general type Fruit, and can't stand any fruit that isn't an apple, you probably need to use the more specific Apple type.


I mean, the problem of assigning `fruits = aps` is not that you get apples in your fruit list, but that now elements added to fruit (not of apple type) also get added to `aps`.

The assignment sort of goes both ways like that.


Ah, your point makes more sense now. Immutability would help a lot here, it seems.


The following has always had me a bit puzzled, be interested in responses.

If bottom type corresponds to no(thing) and top to any(thing) where does every(thing) and some(thing) fit in? Is there any semantic distinction between any(thing) and every(thing)? There should be, right? For me, some(thing) that had type any(thing) would be an _individual_ but some(thing) that had type every(thing) {actually the mind boggles a bit here} would be a _colection_, the collection of all existing or living things. In fact Ruby has the notion of Objectspace. So we can say things like ObjectSpace.each_object{|x| p x}


The any type is an "or" of all your types. Any value in your type system belongs to the top type. The bottom type is a type for which no existing value belongs to this type.

What would an "everything" type be? I'd guess the "and" of all your types, meaning "any value that belongs to all (non-bottom?) types" (e.g. an imaginary 0 symbol meaning zero float, empty list, empty string according to the context). If you want your "every" type to consider bottom type, then the intersection is empty.

Likewise, the "some" type would be a type containing values that belong to at least one type, which IMO looks like the "any" type.


An "everthing" type is impossible. It would not only have to be a float, a list, a string, but also a tuple, a monad, and everything else. So 0 might work as an empty string, and 1 might work as a string, but 1 wouldn't work as a dictionary at all, or as a monad, or...


Well, existential types are pretty similar. If you've got (exists a . a) then you have to be prepared for it to be a float, but also prepared for it to be a list, or a string, but also a tuple, some type for which we only know that it is a monad, and everything else :)

More specifically, we've lost any information to tell us what specific thing might be here so we have to be prepared for any answer. It's the inverse of (forall a . a) which we can ask to become whatever we like.


In that case the everything type is just bottom.

(edit: to be clear, the bottom type, aka zero/empty type).


But bottom is "nothing" and how can "everything" be "nothing"? But then again, should "everything" contain (is that even the right word?) all things, each and every thing, including "nothing"?


"Everything" is the intersection of all types (rather than the union, which is "Anything").

It does not means that it contains the values that all types contain. Rather, it contains the values that belong to every type. In other words, in order to belong to the type "Everything," a value has to simultaneously belong to every other type.

One way to think about this is that the name applies to each member of the type, not to the type as a whole. "Anything" is an accurate description of any possible value, so the set contains all values. "Everything" does not apply to the entire set, but to each individual element. An element must be everything all at once in order for us to describe it as "Everything." Nothing does that, so "Nothing" does that.


This confusion is why explicitly using logical quantifiers is preferable: `exists x. x` is (isomorphic to) the top type (if there is one), and `forall x. x` is (isomorphic to) the bottom type (if there is one).

Avoid natural language like the plague.


Bottom's not a type; it's a value that inhabits all types.


Some type systems also have a bottom type, and this has nothing to do with the bottom you're talking about.


If you consider subtyping, Bottom is a type IIRC, In the category with types as objects and subtyping relationship (a->b if b is subtype of a) as morphisms, Any is the initial object and Bottom is the final object of the category


I think normally one considers the opposite category, where arrows go from subtypes to supertypes, because it can be embedded in the usual category of types and functions, by considering each arrow as the upcasting function.


And what is the type of that value?


Yes, I've seen Any (or top) described like that. It is the super-type of all types. Therefore, as you say, "Any value in your type system belongs to the top type". Imagine the following:

   foo = Sheep.new         => a sheep
   foo.is_a(Any)           => true (for anything! by the nature of Any)
I agree that "some" looks like "any". I wonder is there any (ahem) semantic difference in the context we are considering.

I can see why you would say that (every)thing is the "and" of all my types. But consider that I think it obvious that (every)thing should cover all things, this would include instantiations of types as well as the types themselves.

Programming environment-wise I'm having a hard time imagining how "everything" would work. It must be related to for_all or universal quantification somehow...


"Programming environment-wise I'm having a hard time imagining how "everything" would work."

It may be worth taking a moment to skip out on the abstraction and look more at the representation layer. In the general case, in most if not all implementations, "Any" basically translates to a "Variant" type: https://en.wikipedia.org/wiki/Variant_type

"Dynamically-typed languages" are basically those languages where all variables have the type Any. (The values generally have specific types, but the variables are all Any.)


> But consider that I think it obvious that (every)thing should cover all things, this would include instantiations of types as well as the types themselves.

Types are there to describe the values that can be manipulated by your program. So "as well as the types themselves" is only a concern if your language can manipulate types as first-class values. And this is not so different from all the other types in your language, at this point.

What property should a value hold to belong to the "everything" type, then?


I think you are thinking a group of things as a group of types. Those two concepts are separate. A group of things has its own type which is not necessarily related to the type of the things in that group.

If you want an Everything type that is different from the Any type, the only type I can think of would be a type whose instances behave the same as every type. That doesn't make a lot sense to me.


> If you want an Everything type that is different from the Any type, the only type I can think of would be a type whose instances behave the same as every type. That doesn't make a lot sense to me.

I know, right?

For me, everything semantically would have to include all types and all instances of types. Types are things also after all. In type theory collections of types are called universes. I'm not referring to those (I think). Everything encompasses all things, not just all types. Also, would everything include itself? Giving us Russell's paradox, which type theory seeks to avoid.

We ought to be able to talk about all things. And everything is an alias for the phrase "all things".

I agree that Any is kind of (exactly) like void* in C or BasicObject in Ruby.


Well, I understand what you mean with void* but as a type, void* is not a supertype of all other types.


One of the reasons we need these precise terms is that English is imprecise, and inadequate for this sort of thing. A more useful analogy is the one with formal logic, which can be made precise (Curry-Howard). I'd advise looking that up if you're interested in this sort of thing.

To sort-of approach your questions, ideas that you might find interesting are an unbound existential type T forSome { type T }, and also the fact that Nothing <: T for any type T (this is the principle of explosion in logic)


You might enjoy this exchange on the topic https://groups.google.com/forum/m/#!msg/scala-debate/vysv97J...


> if (false) 23 else null

Why isn't the type of this expression Null?


Because the type "Any" is the least upper bound of Int and Null. http://ktoso.github.io/scala-types-of-types/#unified-type-sy...


I guess GP's point is that it can be statically determined that the given expression evaluates to null.


To support this wouldn't you have to encode in the language spec the exact set of constant folding optimisations applied? Just 'false' is a trivial example, but maybe your compiler can see that 'true && false' is false, but mine can't? What about if the expression is some massive operation that could theoretically be found to be constant, but no real compiler is actually going to do that.

You'd have different compilers determining different types based on optimisation which are supposed to be transparent. Doesn't seem desirable to me.


Not necessarily, it could also be a result of encoding literals as types. If true is the unit value of True a subtype of Bool you could probably have the typsystem figure out which overload to look at (True,False or Bool)

From my understanding DOT could turn out to be quite similar to this pure subtype approach http://lambda-the-ultimate.org/node/4835 so isn't that farfetched either.


Besides, I would say (in my experience) having 'false' like this will tend to occur when debugging, to force a certain execution path.

In this case it is undesirable for the compiler/model to change the static type since that may then let you make other changes which you should'nt be making, or cause compiler errors further down.


This is how the existing type checker works, from primitive types and combination rules. Some rules are more precise than others, though. I think in Scala the type system does not depend on the implementation, but it is specified by the language.


But 'false' isn't a type, it's just a value. The Scala type system doesn't encode boolean values as far as I know (not an expert).


Well, sure, but the over-approximation is surprising given that false is constant and "if" is supposedly a well-known construct.


What you're suggesting is essentially combining some sort of partial evaluation with type-checking. This is going to be hard to do in a consistent, intuitive way: either you only do a very small number of obvious cases (like if (false)) which is not all that useful and just adds complexity or you try to do it in a principled fashion which is difficult and still inherently limited.

Right now, the typing rule for if in Scala is reasonably simple. The condition is a boolean and the whole expression is the least upper bound of the left branch and the right branch. How much more complex would this rule have to be if it needed to fully specify when the condition gets simplified?

To be sure, your idea is interesting and could probably be done well in another language. I wouldn't be surprised if this sort of capability naturally comes up in some dependently typed systems. But it wouldn't work very well in Scala: you'd be adding a lot of complexity for relatively little gain.


I think I can understand if Scala does not want to consider this kind of constant folding: after all, in most code you are supposed to have the same type on the "then" and "else" branch (in OCaml for example you don't have the possibility, as far as I know, to reproduce this case).

I was surprised because I though it is quite a standard analysis in compilers nowadays. I had in mind SBCL precise type-checking (Common Lisp). Consider for example:

    (describe (lambda (x) (if nil x nil)))
    
    #<FUNCTION (LAMBDA (X)) {1005F6E9FB}>
    [compiled function]

    Lambda-list: (X)
    Derived type: (FUNCTION (T) (VALUES NULL &OPTIONAL))
The derived type is NULL (which contains only NIL). And, more importantly, the compiler signals the presence of dead code, which is helpful for finding typos. It can handle more complex situations where the mistake is not obvious. The type system supports interval types like "(integer low high)":

   (describe (lambda (x) (if x 3 2)))
   #<FUNCTION (LAMBDA (X)) {100613759B}>
     [compiled function]

   Lambda-list: (X)
   Derived type: (FUNCTION (T) (VALUES (INTEGER 2 3) &OPTIONAL))
As well as set operations on types:

   (describe (lambda (x) (if x 2 4)))
   #<FUNCTION (LAMBDA (X)) {10062F5C8B}>
     [compiled function]

   Lambda-list: (X)
   Derived type: (FUNCTION (T)
                  (VALUES (OR (INTEGER 2 2) (INTEGER 4 4)) &OPTIONAL))
So it is definitely doable, and even though it is still limited, it is less limited and quite useful.


I'm not too familiar with SBCL but I highly doubt that those are static types. This sort of information is pretty ascertainable if you really want it, but it's going to be rough going doing non-runtime checking over types this rich.


There are declared types and derived types, which are obtained by type inference and propagation. As much as possible, those types are used by the compiler, at compile-time, to signal type errors that are bound to happen at runtime, given the known types.

But the code is robust and you can give a string to a function that is declared to accept a number (e.g. from the REPL): your program won't crash, because types are checked at runtime too (you have an exception).

So, with default safety and speed optimization levels, adding type declarations will actually add runtime checks. But if you say "trust me, this will be a fixnum, produce some efficient code", then the type discrimination that would happen at runtime is bypassed and you directly emit the opcodes related to your type and this is how you optimize your code. For example, requesting a high-level of speed and low safety/debug will report a lot of optimization notes from SBCL, like: "forced to do GENERIC-+ instead of FIXNUM-+ because of type uncertainty".

http://www.lispforum.com/viewtopic.php?f=2&t=191

http://www.xach.com/sbcl/doc/x282.html

http://www.pvk.ca/Blog/2013/11/22/the-weaknesses-of-sbcls-ty...


You can do ifs at the type level in Scala, using dependent types. The difference is that Scala draws a clear distinction between the value level and the type level. This was, after all, the original point of types (in the typed lambda calculus sense).


thats a very slippery slope.

  - (0 < 1) is a constant.
  - ((10 + 10)*15 > 0) is a constant.
where do you draw the line in the type checker at working these things out?


Exactly. The type checker does track constants (`false` internally has type `Boolean(false)`), but it won't drop `if` branches. That happens later, as you can see in the (abridged) output of `scalac -Xprint:all -Xprint-types`:

  [[syntax trees at end of                     typer]] // constif.scala
  class ConstIf extends scala.AnyRef {
    def m: Any = if (false{Boolean(false)})
      23{Int(23)}
    else
      null{Null(null)}{Any}
  }

  [[syntax trees at end of                 refchecks]] // constif.scala
  class ConstIf extends scala.AnyRef {
    def m: Any = null{Null(null)}
  }


Probably somewhere before "if (halts(f,x))".


Related to sibling comment, is the halting problem an issue if you just allow expressions without recursion/loops? Along the lines of constexpr in C++.


This generally falls under the purview of optimizations, which are executed after type checking. In most languages type checking has little to do with the runtime values and everything to do with types. The existence of a bottom type via exceptions and returning is somewhat an exception to this, sure, but that still doesn't involve picking execution branches.

I think you can actually have stuff like this make sense in dependent type systems, though.


Calling those “types of types” is highly misleading. The term “type of types” has a very precise technical meaning: https://ncatlab.org/nlab/show/type+of+types


Also, kinds can be viewed as the "types" of types, right?


Sure. If your hierarchy (values, types, etc.) stops at kinds, I'd rather call them kinds, but even TAPL informally explains kinds as “the types of types” (p. 441, quotation marks in the original).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: