The following is based on conjecture, as I'm not old or well-read enough to be sure of this, but it seems to me that the original purpose of type systems got muddled up by the tremendous popularity, mostly in the Windows world, of "Hungarian" variable naming. When your "type system" consists of adding three letters to the beginning of a variable name, you don't have a way to make descriptive types. Java, in most of the ways I've seen it used, is essentially Hungarian notation enforced as a language feature.
I realize that variable naming and type systems are two different things, but it seems many programmers never grasped the point of type systems (expressing the type for the programmer's sake) because they only ever saw them used to distinguish abstract primitives. For a long time, I had trouble understanding that Hungarian notation wasn't a type system, because the two seemed to do precisely the same thing - that's how limited my understanding of types was. The kind of style seen in this article was alien to me for a long time, but it was enlightening to realize that the type system is there to help me out, not just make me type a bunch of unnecessary crap.
I think what you're getting at is the distinction between this:
float x;
and this:
kilogram x;
The former, as you say, only tells you about the underlying representation. It says "this is a float, so the computer should store it in such and such a way." That's fine, but it doesn't tell us enough.
The latter example is far more useful. In my imaginary language, the declaration implicitly tells the computer to store the value as a float, because the kilogram type has been defined as such elsewhere. But that's not all it does! It tells us and the compiler that this float represents a real-world quantity measured in kilograms. It prevents us from mistakenly passing kilograms where pounds were expected, or seconds where kilograms were expected.
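A minimal sketch of that idea in Haskell (the types and function here are mine, purely for illustration): wrapping the underlying number in distinct newtypes means the compiler rejects a kilogram value where a pound value is expected.

    newtype Kilogram = Kilogram Double
    newtype Pound    = Pound Double

    shippingCost :: Pound -> Double
    shippingCost (Pound lbs) = 1.5 * lbs

    -- shippingCost (Pound 3.0)     -- fine
    -- shippingCost (Kilogram 3.0)  -- type error: expected Pound, got Kilogram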
On the Haskell front, you might be interested in the Dimensional library, which does just that. It also works elegantly with multiplying and dividing units. E.g. if you have a miles value and an hours value, you can divide to get a miles per hour value.
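Something like this, going from memory of the library's docs (treat the module and unit names as approximate):

    {-# LANGUAGE NoImplicitPrelude #-}
    import Numeric.Units.Dimensional.Prelude
    import Numeric.Units.Dimensional.NonSI (mile)
    import qualified Prelude

    distance :: Length Double
    distance = 300 *~ mile

    duration :: Time Double
    duration = 5 *~ hour

    -- Dividing a length by a time yields a velocity; the dimensions are
    -- tracked in the types, so mixing up units is a compile-time error.
    speed :: Velocity Double
    speed = distance / duration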
You can even take this a step further: using `float` instead of `kilogram` is leaking an implementation detail. It's (probably) too low-level a detail to actually be useful.
Luckily, more and more programming languages are making it easier to create these simple types. Haskell:
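    -- e.g. a distinct type for kilograms, with no runtime overhead:
    newtype Kilogram = Kilogram Double
      deriving (Eq, Ord, Show)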
The issue with doing this, of course, is that it's mostly useless for computation. You lose the ability to use mathematical operators because the type is no longer an instance of Num, and even if you create the Num instance, or use -XGeneralizedNewtypeDeriving, you can't multiply a Kilogram by a Meter/Second, for example, since the arguments to (*) must be of the same type. One would need to use a generic "Measure" type instead, where the unit is some metadata attached to the value, and the Num instance implements the checking of units.
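To make the limitation concrete (a sketch with made-up types):

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}

    newtype Kilogram       = Kilogram Double       deriving (Show, Num)
    newtype MetrePerSecond = MetrePerSecond Double deriving (Show, Num)

    nonsense :: Kilogram
    nonsense = Kilogram 2 * Kilogram 3   -- compiles, even though kg * kg is physically dubious

    -- momentum = Kilogram 2 * MetrePerSecond 3
    --   rejected: (*) :: Num a => a -> a -> a forces both arguments to have
    --   the same type, so there is no way to express kg * m/s this way.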
You're quite right about the limitations imposed when you make your own dimensional types like that. It turns out the problem isn't trivial. Which is why I prefer to use a library like Dimensional. With Dimensional, you get plenty of units out-of-the-box, you can define your own if needed, and you can perform arithmetic in a fairly natural way.
And unfortunately even with a "generic Measure type" you can't actually enforce anything at compile time. The "Num" typeclass was not very well designed (given the current facilities of the language, some of which didn't exist at the time Num was designed, so no aspersions cast at the designers!).
Right. An interface exposed with integers may wind up using floats internally, or vice versa; numeric algorithms may wind up doing arbitrary things to precision and accuracy; &c. "You are passing around a float" puts some bounds on what can be delivered but that isn't really sufficient information when it matters and is useless fluff when it doesn't.
"So you've got to typedef single-field structs instead."
Which, thankfully, don't add any runtime cost (as should be expected).
As an aside, you don't actually need to typedef them at that point, but it saves you having to write "struct ..." everywhere so it's typically worthwhile.
Fahrenheit is not easily added given the library's current functionality, and, if I recall correctly, somewhere in the source code there's a comment in which the author explicitly states he doesn't support units that don't correspond linearly to an SI-based unit.
Exactly, which makes me wonder why the library can support Celsius but not Fahrenheit.
To say it in more detail: To transform between Celsius and Kelvin, you just translate. (That operation is supported in Dimensional.)
To transform between Fahrenheit and Celsius or Fahrenheit and Kelvin, you translate and scale.
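Concretely, the two kinds of conversion (just plain functions, to make the distinction clear):

    celsiusToKelvin :: Double -> Double
    celsiusToKelvin c = c + 273.15            -- translation only

    celsiusToFahrenheit :: Double -> Double
    celsiusToFahrenheit c = c * 9 / 5 + 32    -- scaling and translation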
We know that Dimensional can scale, because it supports things like miles to kilometers. We know it can translate, because it supports Celsius to Kelvin. Is it the combination of scaling and translating that makes Fahrenheit impossible?
Yeah - this is exactly the kind of distinction I was trying to make. Reminds me of Abelson's famous quote, "Programs must be written for people to read, and only incidentally for machines to execute." That's an idea that's largely forgotten when it comes to type systems.
Thanks for the pointer - I'll be sure to check out the library as I dive into Haskell over the next few months.
I've had a long-standing interest in this kind of semantic typing. Ironically, one of the things people hate about Java is that it has a deeply engrained culture of using semantic typing everywhere, except that it uses class "boxing" rather than just having something like the type system in Go (typedef on steroids) or something axiom based like Haskell.
So that's why so much Java code ends up looking like this:
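    // the usual hand-rolled wrapper ("boxing") class around a single value
    public final class Kilogram {
        private final double value;

        public Kilogram(double value) { this.value = value; }

        public double getValue() { return value; }
    }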
In fact, it's not uncommon to see things like class Furbie extends FuzzyThing, so you end up with very specific types that don't necessarily behave differently, but do add some supposed semantic precision to the code. Unfortunately, you don't get much benefit from it in Java, so it just ends up being a hassle, resulting in lots of programmer-forced unsafe casting and just as many runtime errors as you'd have got with simpler types. This is not a fault of the language, just poor implementation which, rightly or wrongly, appears to be idiomatic in Java.
The type system in, for example, Go is much simpler, and allows you to create semantic types that actually are the underlying type, instead of creating a class wrapper (called boxing in Java; e.g., a boxed int is the class Integer, an object containing only an int). That means you can define a type like 'kilogram' that's really just a float, but the type system won't silently cast a 'kilogram' value to a float or vice versa, because it knows they are different things.
I haven't got into Haskell in much detail yet, but it basically takes that a few steps further and allows you to define the characteristics of each custom type based on axioms that describe things like whether an operation is commutative or associative, the range of allowed values, etc.
This kind of axiomatic semantic typing is clearly a lot more powerful, but it hasn't really found its way into mainstream languages yet. Go takes the view that it's too complicated for what is intended to be a low-level language. Rust looks interesting because it aims to be equally low-level, but provide some of the same rich typing you'd find in Haskell. Unfortunately it's far from stable, but it'll be interesting to see where it ends up.
> Ironically, one of the things people hate about Java is that it has a deeply engrained culture of using semantic typing everywhere, except that it uses class "boxing" rather than just having something like the type system in Go (typedef on steroids) or something axiom based like Haskell.
I think that people hate the implementation and syntax associated with semantic typing in Java, not the fact that semantic typing is widely used in Java.
I think both much of the recent golden age of dynamic languages and the more recent resurgence of cleaner, lighter-syntax statically typed languages have been motivated by the perception that static typing in Java (and similar languages) has too high a cost for the benefits it provides.
Yeah, I suppose my description of Java as compiler-enforced Hungarian notation isn't entirely fair. I often wonder if Java's problems in terms of bearability are as much syntactic as they are semantic.
As an alternative, check out how Go does this. It's a nice compromise: it lets you declare kilogram with float as its underlying type, treats it as a separate type when used on its own, but doesn't get in the way when you use it in an expression.
That seems useful, but dangerous. If you defined (for some reason) both a centimeter type and an inch type, would you be able to mix them in expressions without warnings?
--------
Per [1] it seems that since both would be named types, you wouldn't be able to assign one to the other without an explicit conversion. If my reading is correct, then that addresses my concern. It's things like assigning a `float` to an `inch` type that can happen without explicit conversion, as long as `inch` has float as its base type.
They are separate types, not a type alias, but some conversions are automatic. The rule for assignability also affects comparisons and function calls (including expressions containing them), but apparently not all expressions like I thought?
In Haskell? I wouldn't do that. That improves readability, but it doesn't allow the compiler to enforce correct unit usage. If a Kilogram is just a synonym for a Double, you can pass it in where other meanings of Double are expected. E.g. you can pass Kilograms into a function that takes Pounds. Which you don't want to do.
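For example (throwaway names, just to illustrate), synonyms are completely transparent to the compiler:

    type Kilogram = Double
    type Pound    = Double

    shippingCost :: Pound -> Double
    shippingCost lbs = 1.5 * lbs

    oops :: Double
    oops = shippingCost (80 :: Kilogram)   -- compiles without complaint

A newtype, on the other hand, makes that call a compile error, at no runtime cost.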
tl;dr: I really need to learn Haskell.