The following is based on conjecture, as I'm not old or well-read enough to be sure of this, but it seems to me that the original purpose of type systems got muddled up by the tremendous popularity, mostly in Windows systems, of the "Hungarian" variable naming style. When your "type system" consists of adding three letters to the beginning of a variable name, you don't have a way to make descriptive types. Java, in most of the ways I've seen it used, is essentially Hungarian notation enforced as a language feature.
I realize that variable naming and type systems are two different things, but it seems many programmers never realized the point of type systems (expressing the type for the programmer's sake) because they only ever saw it used to distinguish abstract primitives. For a long time, I had trouble understanding that Hungarian notation wasn't a type system, because they seemed to do precisely the same thing - that's how limited my understanding of types was. The kind of style seen in this article was alien to me for a long time, but it was enlightening to realize that the type system is there to help me out, not just make me type a bunch of unnecessary crap.
I think what you're getting at is the distinction between this:
float x;
and this:
kilogram x;
The former, as you say, only tells you about the underlying representation. It says "this is a float, so the computer should store it in such and such a way." That's fine, but it doesn't tell us enough.
The latter example is far more useful. In my imaginary language, the declaration implicitly tells the computer to store the value as a float, because the kilogram type has been defined as such elsewhere. But that's not all it does! It tells us and the compiler that this float represents a real-world quantity measured in kilograms. It prevents us from mistakenly passing kilograms where pounds were expected, or seconds where kilograms were expected.
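A minimal sketch of that idea in Haskell (the type and function names here are made up for illustration):

```haskell
newtype Kilogram = Kilogram Float
newtype Pound    = Pound Float

-- A function that only accepts kilograms.
recordMass :: Kilogram -> IO ()
recordMass (Kilogram kg) = putStrLn ("mass: " ++ show kg ++ " kg")

main :: IO ()
main = recordMass (Kilogram 68.0)
-- main = recordMass (Pound 150)   -- rejected at compile time: a Pound is not a Kilogram
```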
On the Haskell front, you might be interested in the Dimensional library, which does just that. It also works elegantly with multiplying and dividing units. E.g. if you have a miles value and an hours value, you can divide to get a miles per hour value.
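A rough sketch of what using it looks like (module, unit, and type names here are from memory and may differ across versions of the library):

```haskell
{-# LANGUAGE NoImplicitPrelude #-}
import Numeric.Units.Dimensional.Prelude
import Prelude (Double)

distance :: Length Double
distance = 100 *~ meter

duration :: Time Double
duration = 9.58 *~ second

-- Dividing a length by a time yields a quantity whose dimension
-- (length/time) is tracked and checked by the compiler.
speed :: Velocity Double
speed = distance / duration
```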
You can even take this a step further: using `float` instead of `kilogram` is leaking an implementation detail. It's (probably) too low-level a piece of information to actually be useful.
Luckily, more and more programming languages are making it easier to create these simple types. In Haskell, for example:
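A minimal sketch of the usual form:

```haskell
-- Kilogram has the same runtime representation as Double,
-- but the compiler treats it as a distinct type.
newtype Kilogram = Kilogram Double
```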
The issue with doing this, of course, is that it's mostly useless for computation. You lose the ability to use mathematical operators because the new type is no longer an instance of Num, and even if you create the Num instance, or use -XGeneralizedNewtypeDeriving, you can't multiply a Kilogram by a Meter/Second, for example, since the arguments to (*) must be of the same type. One would need to use a generic "Measure" type instead, where the unit is metadata attached to the value, and the Num instance implements the type checking on units.
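A sketch of the limitation, with hypothetical unit types and -XGeneralizedNewtypeDeriving supplying the Num instances:

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype Kilogram        = Kilogram Double        deriving Num
newtype MetersPerSecond = MetersPerSecond Double deriving Num

-- Within a single unit, arithmetic works: Kilogram 2 + Kilogram 3 is fine.
-- Across units it does not, because (*) :: Num a => a -> a -> a forces both
-- arguments to have the same type:
--
--   momentum m v = m * v   -- rejected when m :: Kilogram, v :: MetersPerSecond
--
-- so you end up unwrapping by hand (and the result loses its unit entirely):
momentum :: Kilogram -> MetersPerSecond -> Double
momentum (Kilogram m) (MetersPerSecond v) = m * v
```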
You're quite right about the limitations imposed when you make your own dimensional types like that. It turns out the problem isn't trivial. Which is why I prefer to use a library like Dimensional. With Dimensional, you get plenty of units out-of-the-box, you can define your own if needed, and you can perform arithmetic in a fairly natural way.
And unfortunately even with a "generic Measure type" you can't actually enforce anything at compile time. The "Num" typeclass was not very well designed (given the current facilities of the language, some of which didn't exist at the time Num was designed, so no aspersions cast at the designers!).
Right. An interface exposed with integers may wind up using floats internally, or vice versa; numeric algorithms may wind up doing arbitrary things to precision and accuracy; &c. "You are passing around a float" puts some bounds on what can be delivered but that isn't really sufficient information when it matters and is useless fluff when it doesn't.
"So you've got to typedef single-field structs instead."
Which, thankfully, don't add any runtime cost (as should be expected).
As an aside, you don't actually need to typedef them at that point, but it saves you having to write "struct ..." everywhere so it's typically worthwhile.
Fahrenheit is not easily addable given the current implementation and, if I recall correctly, somewhere in the source code there's a comment in which the author explicitly states he doesn't support units that don't correspond linearly to the SI base unit.
Exactly, which makes me wonder why the library can support Celsius but not Fahrenheit.
To say it in more detail: To transform between Celsius and Kelvin, you just translate. (That operation is supported in Dimensional.)
To transform between Fahrenheit and Celsius or Fahrenheit and Kelvin, you translate and scale.
We know that Dimensional can scale, because it supports things like miles to kilometers. We know it can translate, because it supports Celsius to Kelvin. Is it the combination of scaling and translating that makes Fahrenheit impossible?
Yeah - this is exactly the kind of distinction I was trying to make. Reminds me of Abelson's famous quote, "Programs must be written for people to read, and only incidentally for machines to execute." That's an idea that's largely forgotten when it comes to type systems.
Thanks for the pointer - I'll be sure to check out the library as I dive into Haskell over the next few months.
I've had a long-standing interest in this kind of semantic typing. Ironically, one of the things people hate about Java is that it has a deeply engrained culture of using semantic typing everywhere, except that it uses class "boxing" rather than just having something like the type system in Go (typedef on steroids) or something axiom-based like Haskell.
So that's why so much Java code ends up defining a new class for every little domain concept.
In fact, it's not uncommon to see things like class Furbie extends FuzzyThing, so you end up with very specific types that don't necessarily behave differently, but do add some supposed semantic precision to the code. Unfortunately, you don't get much benefit from it in Java, so it just ends up being a hassle, resulting in lots of programmer-forced unsafe casting and just as many runtime errors as you'd have got with simpler types. This is not a fault of the language, just poor implementation which rightly or wrongly appears to be idiomatic in Java.
The type system in, for example, Go is much simpler, and allows you to create semantic types that actually are the underlying type instead of creating a class wrapper (called boxing in Java, e.g., boxed int is class Integer, which is an object containing only an int), which means you can define a type like 'kilogram' that's really just a float, but the type system won't silently cast a 'kilogram' value to a float or vice-versa because it knows they are different things.
I haven't got into Haskell in much detail yet, but it basically takes that a few steps further and allows you to define the characteristics of each custom type based on axioms that describe things like whether it's commutative or associative, the range of allowed values, etc.
This kind of axiomatic semantic typing is clearly a lot more powerful, but it hasn't really found its way into mainstream languages yet. Go takes the view that it's too complicated for what is intended to be a low-level language. Rust looks interesting because it aims to be equally low level, but provide some of the same rich typing you'd find in Haskell. Unfortunately it's far from stable, but it'll be interesting to see where it ends up.
> Ironically, one of the things people hate about Java is that it has a deeply engrained culture of using semantic typing everywhere, except that it uses class "boxing" rather than just having something like the type system in Go (typedef on steroids) or something axiom based like Haskell.
I think that people hate the implementation and syntax associated with semantic typing in Java, not the fact that semantic typing is widely used in Java.
I think both much of the recent golden age of dynamic languages and the more recent resurgence of cleaner, less-heavy-syntax statically typed languages have been motivated by the perception that static typing in Java (and similar languages) has too high a cost for the benefits it provides.
Yeah, I suppose my description of Java as compiler-enforced Hungarian notation isn't entirely fair. I often wonder if Java's problems in terms of bearability are as much syntactic as they are semantic.
As an alternative, check out how Go does this. It's a nice compromise that lets you declare kilogram to be a float, and when used alone it treats it as a separate type, but it doesn't get in the way when you use it in an expression.
That seems useful, but dangerous. If you specify (for some reason) both a centimeter and inches type, would you be able to mix them in expressions without warnings?
--------
Per [1] it seems that since both would be named types you wouldn't be able to assign one to the other without explicit conversions. If my reading is correct, then that addresses my concern. It's things like assigning a `float` to an `inch` type that can happen without explicit conversion, as long as `inch` has `float` as its base type.
They are separate types, not a type alias, but some conversions are automatic. The rule for assignability also affects comparisons and function calls (including expressions containing them), but apparently not all expressions like I thought?
In Haskell? I wouldn't do that. That improves readability, but it doesn't allow the compiler to enforce correct unit usage. If a Kilogram is just a synonym for a Double, you can pass it in where other meanings of Double are expected. E.g. you can pass Kilograms into a function that takes Pounds. Which you don't want to do.
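A short sketch of why the synonym doesn't buy you anything (the names are illustrative):

```haskell
-- With type synonyms, Kilogram and Pound are the same type to the compiler:
type Kilogram = Double
type Pound    = Double

chargeShipping :: Pound -> Double
chargeShipping lbs = lbs * 0.50

-- This compiles without complaint, even though the unit is wrong:
oops :: Double
oops = chargeShipping (75 :: Kilogram)

-- With newtypes instead (newtype Kilogram = Kilogram Double, etc.),
-- the same call would be a compile-time error.
```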
The type system is what the compiler (or interpreter) knows about your program. Sophisticated type systems know a lot, and can tell you a lot. Sophisticated does not mean typing a lot.
When building code for the long term, you really want to focus on techniques that make it impossible to misuse an API, whenever it's possible to do that. This article nicely calls out one such technique.
Generic typing can be used to (try to) force the consumer of an API into correct usage. Judicious use of final and abstract in the land of Java can be used to force/guide eventual overriders of an abstract class down the right path -- they'll get compilation errors if they don't at least implement the right methods.
Whenever you find yourself writing an assert method, take a beat and figure out if the type system could have turned that into a compilation error instead.
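A small Haskell illustration of that trade, with hypothetical functions: the first version checks non-emptiness at runtime, the second makes the empty case unrepresentable so the mistake is caught at compile time instead.

```haskell
import Data.List.NonEmpty (NonEmpty)

-- Runtime check: callers can still pass [], and only find out when it runs.
averageChecked :: [Double] -> Double
averageChecked xs
  | null xs   = error "averageChecked: empty list"
  | otherwise = sum xs / fromIntegral (length xs)

-- Compile-time guarantee: an empty list simply isn't a valid argument,
-- so the caller is forced to handle the empty case before calling.
average :: NonEmpty Double -> Double
average xs = sum xs / fromIntegral (length xs)
```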
"The type system is what the compiler (or interpreter) knows about your program."
Compilers know a lot more than just what is represented in "the type system" (as it's typically thought of, anyway). Control flow (basic block analysis, &c) for instance. Which isn't really taking away from your larger point - certainly, things represented in the type system are things the compiler will know about.
"Whenever you find yourself writing an assert method, take a beat and figure out if the type system could have turned that into a compilation error instead."
Possibly with a _Static_assert (or static_assert) as of C11 (/C++11)! Obviously there are substantial limitations still, but it's great to be able to pull more to compile time cleanly.
This discussion reminds me of a 2005 article from Joel Spolsky called "Making Wrong Code Look Wrong". It has a similar point to this article: there are things we can do as programmers to make our code less error prone. It's definitely worth a read if you're interested in this topic and haven't read Joel's article before.
Type systems in languages make me less productive.
Unless I have to provide support, add features, scale my application, write bug free code, hire additional developers, or explain my thought process to anyone else.
Right, it's well known that type systems are essentially useless, except when you need to write high quality code, and do all those things you mentioned. Basically they're unnecessary (but only if you use the term unnecessary very literally).
> val player = new Player(new Vector2(100, 100), new Vector2(50, 50))
> No problems here. The first noticeable crack in the system comes after a few weeks vacation away from this code. You come back, and you want to change the starting point of the player. Will you remember which Vector2 to change?
player = Player(position=Vector2(100, 100), size=Vector2(50, 50))
The problem is not typing here, it's lack of named arguments.
Named arguments by themselves do not fundamentally prevent data type errors when the underlying types are identical. With short variable names (v1, v2) or some other cosmetic ambiguity, the mistake will still compile deceptively, with named arguments as the only defense.
Both of the following would compile successfully using named arguments but one of them is semantically wrong:
Player(pos = v1, size = v2);
Player(pos = v2, size = v1);
Using the compiler's static type checking enforces correct semantics more reliably than named arguments do. Of course, this only works if the programmer creates new differentiated types so that they are no longer identical (which was the crux of the article.)
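Sketched in Haskell with made-up Position and Size wrappers, the same swap becomes a compile error rather than something named arguments merely discourage:

```haskell
data Vector2 = Vector2 Float Float

newtype Position = Position Vector2
newtype Size     = Size Vector2

data Player = Player Position Size

v1, v2 :: Vector2
v1 = Vector2 100 100
v2 = Vector2 50 50

ok :: Player
ok = Player (Position v1) (Size v2)

-- swapped = Player (Size v2) (Position v1)   -- rejected: a Size is not a Position
```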
I agree that in languages without compiler-defined types, but with other tools such as named arguments, the class of errors described here can to some extent be avoided by making good use of them. I'm gonna do this more often in my Python code!
Dynamically typed languages are expressive, quick and fun.
But strict static typing is like a seat belt, it's annoying, but it just might save your life.
Also ceremony code is there for a reason; that reason is not to annoy you, it is to make sure that those who come after you know what it is that you have done and why.
> But strict static typing is like a seat belt, it's annoying, but it just might save your life.
The problem with saying "static typing" without further precision is that on one axis it ranges from C, where the seatbelt is made of paper, to ATS or Idris, where the "seatbelt" is a zero-zero ejection seat; and on another axis it ranges from Java, where you have to braid the seatbelt from raw fibers every time you sit down, to MLs or Haskell, where the seatbelt magically appears around you.
> The problem with saying "static typing" without further precision is that on one axis it ranges from C, where the seatbelt is made of paper, to ATS or Idris, where the "seatbelt" is a zero-zero ejection seat; and on another axis it ranges from Java, where you have to braid the seatbelt from raw fibers every time you sit down, to MLs or Haskell, where the seatbelt magically appears around you.
Regarding C, it is weakly typed, so it doesn't care whether you feed it apples or habanero chillies; the static type system isn't for safety, it's for improving compilation speed (which was often a big headache in the 1970s).
It sticks with the methodology of C: if you make a mistake you pay for it dearly.
There is an entire lineage of real object-oriented languages with nominal types, starting from the dynamically typed Smalltalk (research on which is where we learned Hindley-Milner was horribly ill-suited to OOP in the first place). Or, if you prefer, just look at how you use your natural language to name things and reason about the world.
On the other hand, so many people have abandoned the most common forms of OOP. See for example the story from yesterday [1]. Many writers have written about how brittle object hierarchies are in anything nontrivial, [2] for example.
I will say though that newer models of OOP, focusing on interfaces and composability of behavior, seem to have quite a bit of steam left in them.
Don't count OOP as being dead yet, the FP stuff is useful (I use it myself), but we have our own better abstractions coming out, like having more fun with mixin-style inheritance. The FP + OOP crowd is pretty vocal, I'll give you that.
One language that enforces strong typing is Ada. It has compile-time and run-time type checking. It also has many other features to make the code less error-prone and more maintainable. It was used by the DoD and also used in critical applications, such as nuclear power plants. The early versions of Java reminded me very much of Ada.
As far as I know, there's no runtime type checking in Ada, but there is runtime value checking to ensure values are within the specified ranges/adhere to the specified predicates (a very cool feature; you can specify that a type is always even, and encountering an odd value will cause an exception).
Oh, there is runtime type checking: it covers subtypes (which is mostly range checks) and checks of tags for tagged types (usually in class-wide subprograms), though both are often optimized out at compile time.
I'm hoping programming grows up enough to remember languages like Ada. Dynamic languages are useful for "get something running quick, then work on getting it right". Ada makes you get it right up front, or it doesn't even compile. The former is good if you have to bang out a script this afternoon; the latter is good if you have to maintain a code base for 10+ years, with multiple programmers, and some assurances about reliability.
The way these are implemented, the short answer is yes. There will be more memory pressure on the garbage collector in Scala. The long answer is, "mmmm, maybe, you should test it", because, due to escape analysis, eden-generation collection speed, the JIT, etc., it can get very complicated very fast. It can also be very dependent on what you mean by "significant penalty" and "performance".
As an aside, Scala does allow for compile-time-only small types in the form of AnyVals. They won't help in the examples in the article, but for the simpler kilogram-wrapper-around-float case they will, and there will be no performance penalty, as the wrapper is elided by the compiler.
It depends on the implementation. I'm not familiar with Scala, but in Haskell, using `newtype`s to achieve this creates absolutely no extra runtime overhead, only compile-time overhead when the types are checked.
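There is one wrinkle, though. Consider code that wraps every element of a list in a newtype constructor (a reconstruction of the kind of example meant here, with a hypothetical Foo wrapper):

```haskell
newtype Foo a = Foo a

wrapAll :: [Int] -> [Foo Int]
wrapAll xs = map Foo xs
```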
which should be a no-op at runtime, because the representations of Foo x and x are the same, but due to the newtype, the traversal does need to occur. [1] discusses a way to avoid these problems. Basically, these newtypes should only ever be needed during type checking and should really be erased after that; once you've type-checked everything, you know your program won't ever confuse a plain Int with a Foo Int, so you can then get rid of the compile-time tag and gain better optimisations.
As a general rule, types are dealt with at compile time, and will only influence the optimizer leading to faster code...
But as people have already pointed out, in practice compilers are never optimal. When it's important (that is, rarely), it's good to test, or at least read the compiler's manual.
I came across this a few weeks ago; I think the first example has to do with arrays being second-class.
On the other hand, if you give data to a C programmer, they can locate its position in RAM. Give an assembly programmer a simple data type, and they can point to the specific register(s) in the processor.
The general principle makes a ton of sense: let each line of your code express as much of your intent as possible. That way, when you have a huge codebase, there are all these useful little "hooks" throughout it that will be useful when you need to refactor.
Your codebase will "say" more, and therefore your tools will "know" more, and therefore your tools can do more for you.
Oh, but one more thing. It's very nice if the abstractions you use to do this are resolved down to nothing at runtime. So there's no penalty at runtime for the precision you added to the code. Let the compiler do the hard work so that you can say "Position", but at runtime, the code is as if you had written "Float".
I'm actually surprised that people learn about this only now... It is quite standard (and very common) thing to do in Ada.
But then, we have been able to write "subtype Strength is Natural range Natural'First .. 32;" since the '80s (a numeric type with all operators defined and a permitted range of values from 0 to 32; range checks are performed at runtime by the compiler, but most of them are optimized out anyway, so the penalty is in the single-digit percent range). Or, if you prefer an example from that site: "type First_Name is new String;"
While I agree with the idea of the article, what's the logic for not using "small types" for all parameters? Or, to put it another way, how do you decide a given method should have "small types" used for parameters?
Also, although it makes the code much more verbose, named parameters (as used in, for example, Objective-C) prevent any potential bugs when refactoring methods' signatures, right?
> Or, to put it another way, how do you decide a given method should have "small types" used for parameters?
I think a good rule of thumb is to ask yourself what "dimension" or "unit" the parameter has. (See also: dimensional analysis.) You never want to pass 4.33f radians into a function expecting 4.33f newtons!
A few more examples:
Distance in meters, Distance in feet, Speed in m/s, speed in yards/minute, Acceleration in m/s^2, Acceleration in cm/s^2, Mass in grams, Weight in pounds, Force in newtons, Force in dynes, Temperature in Celsius, Temperature in Fahrenheit, Angles in radians, Angles in degrees....
"But I don't do physics programming!", you say? Well, there's plenty more:
Time in seconds, Time in milliseconds, Distance in pixels, Distance in inches, Numeric ID of a Foo, Numeric ID of a Bar, Bytes in UTF8, Bytes in ASCII, US currency in dollars, US currency in cents, Euros, interest rate (yearly), interest rate (monthly)...
And that's not even getting into industry-specific dimensions that might exist.
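For instance, even a humble pair of time units can be kept apart (the wrappers and sleepFor here are hypothetical):

```haskell
newtype Seconds      = Seconds Double
newtype Milliseconds = Milliseconds Double

toMilliseconds :: Seconds -> Milliseconds
toMilliseconds (Seconds s) = Milliseconds (s * 1000)

-- An API that is explicit about its unit.
sleepFor :: Milliseconds -> IO ()
sleepFor (Milliseconds ms) = putStrLn ("sleeping " ++ show ms ++ " ms")

main :: IO ()
main = sleepFor (toMilliseconds (Seconds 30))
-- main = sleepFor (Seconds 30)   -- would not compile; the unit mix-up is impossible
```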
That brings up an interesting issue -- is the "right way" to do that to have different types for these, or one type for distance with different factory function/methods for different units. E.g.: should "meters" and "feet" be different types, or should the "measurement" module have both a "meters" and "feet" method that return the same ("distance") type.
It seems to me the latter is more conceptually clean.
OTOH, math might be easier to express with the former (particularly in a language that allows you to define implicit type conversions, so that if you pass an "inch" value to a function expecting a "cm" parameter, it gets automatically converted to the appropriate "cm" value.)
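The latter approach might look something like this (a sketch; the choice of meters as the internal unit and all the names are arbitrary):

```haskell
-- One Distance type; values are built through unit-aware constructors and
-- stored internally in meters by convention.
newtype Distance = Distance Double

meters :: Double -> Distance
meters = Distance

feet :: Double -> Distance
feet ft = Distance (ft * 0.3048)

inMeters :: Distance -> Double
inMeters (Distance m) = m

-- Arithmetic stays simple because every Distance shares the same internal unit.
total :: Distance
total = Distance (inMeters (meters 400) + inMeters (feet 29))
```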
Well, there are really three different kinds of information involved. For example, the variable "target_resist" might have: Computational type (float), Dimension (resistance), and Unit (milliohms).
Most built-in type-systems (sensibly!) only try to tackle the broadly-applicable computational-type, since the rest is context-specific.
However, in a contractual sense, all of those are important; if any are not as expected, you'll get bugs.
OK, I can understand that logic, but then you might end up with dozens and dozens of "small types". So, as far as I understand, those who try to follow that paradigm have no problem whatsoever with the potentially large number of these types, is that right?
(Again, asked another way, is it ever a problem to create a "small type" if it follows that rule? For some codebases it might mean having a "small type" per unique parameter on signatures.)
> Again, asked another way, is it ever a problem to create a "small type" if it follows that rule?
IMO the biggest problem isn't having too many distinct types, because they represent information you'd need to track manually anyway, like knowing that $time is seconds rather than milliseconds. I'd be more worried about:
1. The risk of conflicts between different groups of people who have each defined their own type called "Newtons" and want their libraries to work together.
2. The sliding-scale of inconvenience if your implementation gets in the way of the mathematics, like having to call `foo.asFloat() * 2` rather than `foo * 2`. (You can relax things a bit by relying on best-effort annotations.)
The example given in this post doesn't show this, but what using types allows us to do is understand what a specific value is throughout its whole lifetime through all sorts of functions. We know that this vector is a velocity, and our compiler knows that too, so we can't unwittingly treat it as an acceleration vector further on down the line, whether that's done when passing it to another function, declaring variables, adding things together, or what-have-you. You are right that the specific example shown in the article can be achieved with simple named parameters, but that's a very small and specific application of the very broad protection that proper type use gives you.
According to the Curry-Howard correspondence, types are conclusions and programs are proofs of those conclusions. Every program corresponds to a deduction procedure, such as a Hilbert-style proof or a natural-deduction proof.
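A tiny Haskell illustration: read the type below as the proposition "A and B implies A"; the function body is its proof.

```haskell
-- Proposition: (A ∧ B) → A.  The program is the proof: it produces
-- evidence for A from evidence for A ∧ B.
projLeft :: (a, b) -> a
projLeft (x, _) = x
```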
tl;dr: I really need to learn Haskell.