The following is based on conjecture, as I'm not old or well-read enough to be sure of this, but it seems to me that the original purpose of type systems got muddled up by the tremendous popularity, mostly in the Windows world, of "Hungarian" variable naming. When your "type system" consists of adding three letters to the beginning of a variable name, you don't have a way to make descriptive types. Java, in most of the ways I've seen it used, is essentially Hungarian notation enforced as a language feature.
I realize that variable naming and type systems are two different things, but it seems many programmers never grasped the point of type systems (expressing the type for the programmer's sake) because they only ever saw them used to distinguish abstract primitives. For a long time, I had trouble understanding that Hungarian notation wasn't a type system, because the two seemed to do precisely the same thing - that's how limited my understanding of types was. The kind of style seen in this article was alien to me for a long time, but it was enlightening to realize that the type system is there to help me out, not just make me type a bunch of unnecessary crap.
I think what you're getting at is the distinction between this:
float x;
and this:
kilogram x;
The former, as you say, only tells you about the underlying representation. It says "this is a float, so the computer should store it in such and such a way." That's fine, but it doesn't tell us enough.
The latter example is far more useful. In my imaginary language, the declaration implicitly tells the computer to store the value as a float, because the kilogram type has been defined as such elsewhere. But that's not all it does! It tells us and the compiler that this float represents a real-world quantity measured in kilograms. It prevents us from mistakenly passing kilograms where pounds were expected, or seconds where kilograms were expected.
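A minimal sketch of that idea in Haskell (the types and function here are mine, purely for illustration): wrapping the underlying number in distinct newtypes means the compiler rejects a kilogram value where a pound value is expected.

    newtype Kilogram = Kilogram Double
    newtype Pound    = Pound Double

    shippingCost :: Pound -> Double
    shippingCost (Pound lbs) = 1.5 * lbs

    -- shippingCost (Pound 3.0)     -- fine
    -- shippingCost (Kilogram 3.0)  -- type error: expected Pound, got Kilogram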
On the Haskell front, you might be interested in the Dimensional library, which does just that. It also works elegantly with multiplying and dividing units. E.g. if you have a miles value and an hours value, you can divide to get a miles per hour value.
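Something like this, going from memory of the library's docs (treat the module and unit names as approximate):

    {-# LANGUAGE NoImplicitPrelude #-}
    import Numeric.Units.Dimensional.Prelude
    import Numeric.Units.Dimensional.NonSI (mile)
    import qualified Prelude

    distance :: Length Double
    distance = 300 *~ mile

    duration :: Time Double
    duration = 5 *~ hour

    -- Dividing a length by a time yields a velocity; the dimensions are
    -- tracked in the types, so mixing up units is a compile-time error.
    speed :: Velocity Double
    speed = distance / duration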
You can even take this a step further: using `float` instead of `kilogram` is leaking an implementation detail. It's (probably) too low-level a detail to actually be useful.
Luckily, more and more programming languages are making it easier to create these simple types. Haskell:
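    -- e.g. a distinct type for kilograms, with no runtime overhead:
    newtype Kilogram = Kilogram Double
      deriving (Eq, Ord, Show)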
The issue with doing this, of course, is that it's mostly useless for computation. You lose the ability to use mathematical operators because the type is no longer an instance of Num, and even if you create the Num instance, or use -XGeneralizedNewtypeDeriving, you can't multiply a Kilogram by a Meter/Second, for example, since the arguments to (*) must be of the same type. One would need to use a generic "Measure" type instead, where the unit is some metadata attached to the value, and the Num instance implements the checking of units.
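To make the limitation concrete (a sketch with made-up types):

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}

    newtype Kilogram       = Kilogram Double       deriving (Show, Num)
    newtype MetrePerSecond = MetrePerSecond Double deriving (Show, Num)

    nonsense :: Kilogram
    nonsense = Kilogram 2 * Kilogram 3   -- compiles, even though kg * kg is physically dubious

    -- momentum = Kilogram 2 * MetrePerSecond 3
    --   rejected: (*) :: Num a => a -> a -> a forces both arguments to have
    --   the same type, so there is no way to express kg * m/s this way.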
You're quite right about the limitations imposed when you make your own dimensional types like that. It turns out the problem isn't trivial. Which is why I prefer to use a library like Dimensional. With Dimensional, you get plenty of units out-of-the-box, you can define your own if needed, and you can perform arithmetic in a fairly natural way.
And unfortunately even with a "generic Measure type" you can't actually enforce anything at compile time. The "Num" typeclass was not very well designed (given the current facilities of the language, some of which didn't exist at the time Num was designed, so no aspersions cast at the designers!).
Right. An interface exposed with integers may wind up using floats internally, or vice versa; numeric algorithms may wind up doing arbitrary things to precision and accuracy; &c. "You are passing around a float" puts some bounds on what can be delivered but that isn't really sufficient information when it matters and is useless fluff when it doesn't.
"So you've got to typedef single-field structs instead."
Which, thankfully, don't add any runtime cost (as should be expected).
As an aside, you don't actually need to typedef them at that point, but it saves you having to write "struct ..." everywhere so it's typically worthwhile.
Fahrenheit is not easily added given the library's current functionality, and, if I recall correctly, somewhere in the source code there's a comment in which the author explicitly states he doesn't support units that don't correspond linearly to an SI-based unit.
Exactly, which makes me wonder why the library can support Celsius but not Fahrenheit.
To say it in more detail: To transform between Celsius and Kelvin, you just translate. (That operation is supported in Dimensional.)
To transform between Fahrenheit and Celsius or Fahrenheit and Kelvin, you translate and scale.
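Concretely, the two kinds of conversion (just plain functions, to make the distinction clear):

    celsiusToKelvin :: Double -> Double
    celsiusToKelvin c = c + 273.15            -- translation only

    celsiusToFahrenheit :: Double -> Double
    celsiusToFahrenheit c = c * 9 / 5 + 32    -- scaling and translation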
We know that Dimensional can scale, because it supports things like miles to kilometers. We know it can translate, because it supports Celsius to Kelvin. Is it the combination of scaling and translating that makes Fahrenheit impossible?
Yeah - this is exactly the kind of distinction I was trying to make. Reminds me of Abelson's famous quote, "Programs must be written for people to read, and only incidentally for machines to execute." That's an idea that's largely forgotten when it comes to type systems.
Thanks for the pointer - I'll be sure to check out the library as I dive into Haskell over the next few months.
I've had a long-standing interest in this kind of semantic typing. Ironically, one of the things people hate about Java is that it has a deeply engrained culture of using semantic typing everywhere, except that it uses class "boxing" rather than just having something like the type system in Go (typedef on steroids) or something axiom based like Haskell.
So that's why so much Java code ends up looking like this:
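    // the usual hand-rolled wrapper ("boxing") class around a single value
    public final class Kilogram {
        private final double value;

        public Kilogram(double value) { this.value = value; }

        public double getValue() { return value; }
    }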
In fact, it's not uncommon to see things like class Furbie extends FuzzyThing, so you end up with very specific types that don't necessarily behave differently, but do add some supposed semantic precision to the code. Unfortunately, you don't get much benefit from it in Java, so it just ends up being a hassle, resulting in lots of programmer-forced unsafe casting and just as many runtime errors as you'd have got with simpler types. This is not a fault of the language, just poor implementation which, rightly or wrongly, appears to be idiomatic in Java.
The type system in, for example, Go is much simpler, and allows you to create semantic types that actually are the underlying type, instead of creating a class wrapper (called boxing in Java; e.g., a boxed int is the class Integer, an object containing only an int). That means you can define a type like 'kilogram' that's really just a float, but the type system won't silently cast a 'kilogram' value to a float or vice versa, because it knows they are different things.
I haven't got into Haskell in much detail yet, but it basically takes that a few steps further and allows you to define the characteristics of each custom type based on axioms that describe things like whether an operation is commutative or associative, the range of allowed values, etc.
This kind of axiomatic semantic typing is clearly a lot more powerful, but it hasn't really found its way into mainstream languages yet. Go takes the view that it's too complicated for what is intended to be a low-level language. Rust looks interesting because it aims to be equally low-level, but provide some of the same rich typing you'd find in Haskell. Unfortunately it's far from stable, but it'll be interesting to see where it ends up.
> Ironically, one of the things people hate about Java is that it has a deeply engrained culture of using semantic typing everywhere, except that it uses class "boxing" rather than just having something like the type system in Go (typedef on steroids) or something axiom based like Haskell.
I think that people hate the implementation and syntax associated with semantic typing in Java, not the fact that semantic typing is widely used in Java.
I think both much of the recent golden age of dynamic languages and the more recent resurgence of cleaner, lighter-syntax statically typed languages have been motivated by the perception that static typing in Java (and similar languages) has too high a cost for the benefits it provides.
Yeah, I suppose my description of Java as compiler-enforced Hungarian notation isn't entirely fair. I often wonder if Java's problems in terms of bearability are as much syntactic as they are semantic.
As an alternative, check out how Go does this. It's a nice compromise: it lets you declare kilogram with float as its underlying type, treats it as a separate type when used on its own, but doesn't get in the way when you use it in an expression.
That seems useful, but dangerous. If you defined (for some reason) both a centimeter type and an inch type, would you be able to mix them in expressions without warnings?
--------
Per [1] it seems that since both would be named types, you wouldn't be able to assign one to the other without an explicit conversion. If my reading is correct, then that addresses my concern. It's things like assigning a `float` to an `inch` type that can happen without explicit conversion, as long as `inch` has float as its base type.
They are separate types, not a type alias, but some conversions are automatic. The rule for assignability also affects comparisons and function calls (including expressions containing them), but apparently not all expressions like I thought?
In Haskell? I wouldn't do that. That improves readability, but it doesn't allow the compiler to enforce correct unit usage. If a Kilogram is just a synonym for a Double, you can pass it in where other meanings of Double are expected. E.g. you can pass Kilograms into a function that takes Pounds. Which you don't want to do.
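For example (throwaway names, just to illustrate), synonyms are completely transparent to the compiler:

    type Kilogram = Double
    type Pound    = Double

    shippingCost :: Pound -> Double
    shippingCost lbs = 1.5 * lbs

    oops :: Double
    oops = shippingCost (80 :: Kilogram)   -- compiles without complaint

A newtype, on the other hand, makes that call a compile error, at no runtime cost.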
tl;dr: I really need to learn Haskell.