Hacker News
Return type polymorphism in Haskell (thegreenplace.net)
116 points by osopanda on Feb 5, 2018 | 72 comments



Yeah, this is a really great Haskell feature, and it often feels like magic the first time you see it. It's key to a whole lot of useful Haskell idioms.
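For anyone who hasn't seen it in Haskell itself, here's a minimal toy sketch (my own example, not from the article) of one expression producing different values depending on the type the context demands:

```haskell
-- Return type polymorphism: the instance is chosen by the
-- type the context demands, not by any argument.
main :: IO ()
main = do
  let n = read "42" :: Int      -- the Read instance for Int is selected
      d = read "42" :: Double   -- same call, Double instance selected
      s = mempty :: String      -- mempty is "" at type String
  print n          -- 42
  print d          -- 42.0
  print (s == "")  -- True
```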

Well-known languages which do this include Haskell and Rust (where it's also useful). In Rust, you can write things like:

    // Generate an appropriate default value for a type.
    // Here, this will create an empty vector of String.
    let my_vec: Vec<String> = Default::default();
    
    // Parse JSON into the specified type.
    let numbers: Vec<u32> = serde_json::from_str("[1, 2, 3]")?;
I think there may be ways to do something similar with C++ template metaprogramming and partial specialization, and I wouldn't be surprised if some of the ML family supported something similar using "modules". (But I don't really understand ML modules, so don't quote me on that.)

It's a great feature, and I would love for more strongly-typed languages to support it.


Yes, it feels great to just drop `read` into a context with enough type information to infer the return type.

On the other hand, when the type context is not sufficient, it gets a bit bulky:

   -- with sufficient type context:
   xs <- map read <$> lines <$> getContents

   -- an ugly inline type signature:
   xs <- map (read :: String -> Int) <$> lines <$> getContents
For many such calls it's good to monomorphize:

   readInt :: String -> Int
   readInt = read
but it still does not feel elegant (what about other return types?).
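Since GHC 8.0 there's also TypeApplications, which is terser than an inline signature or a monomorphic wrapper (a sketch; it does require enabling yet another extension):

```haskell
{-# LANGUAGE TypeApplications #-}

main :: IO ()
main = do
  contents <- getContents
  -- read @Int pins the return type without a full String -> Int signature
  let xs = map (read @Int) (lines contents)
  print (sum xs)
```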

Edit: obviously, this must have been `xs <-`, not `let xs =`.


Is magic worth anything if it doesn’t work on 100% of its surface?

Isn’t “use magic, except in cases where the magical internals don’t apply—then be explicit” a strictly worse API than “just be explicit”?


Return type polymorphism has usecases where it is irreplaceable (see other comment threads).

I am just pointing out an example of bad use, which is not a problem of the language, just of this specific API.

As for APIs in general, sometimes they do need some leeway for future extension.


Couldn't you just type `xs` as `[Int]` and let the compiler work itself backwards from there?


I don't know how I can actually do this inside a do-block:

    xs :: [Int] <- ...
does not work without `ScopedTypeVariables` (and I don't like proliferation of redundant language extensions).
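For reference, with the extension enabled the pattern signature does work as expected (a sketch; ScopedTypeVariables is also part of the GHC2021 default set in recent compilers):

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

main :: IO ()
main = do
  -- pattern signature in a do-binding; rejected without the extension
  (xs :: [Int]) <- map read . lines <$> getContents
  print (sum xs)
```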

In Rust, sure, it's not a problem and plays well with the language.

My point is, it is easy to abuse this feature (this example is from Haskell Prelude!), even though it is indispensable in some situations (like `mempty :: a`).


You can either type the return value at the end of your line:

    xs <- map read <$> lines <$> getContents :: IO [Int]
Or you can force the xs type on some other line:

    let _ = xs :: [Int]
There is a recent idiom for that last format that lets you write on the same line as the expression. I think it's something like:

    (xs @ [Int]) <- map read <$> lines <$> getContents
But I haven't used it yet and I am not sure exactly what version of GHC allows it.


Today I learned that `:: IO [Int]` at the end of an expression is applied to the whole expression. Thanks!


> does not work without `ScopedTypeVariables` (and I don't like proliferation of redundant language extensions).

It does something useful for you! How can it be redundant?


You don't like the part where compiler authors are free to make efforts to improve the language that are clearly delineated as not part of the standard and easy to not enable if they prove to not be a good idea in practice?

What are the alternatives? Completely discarding the language standard? Never making attempts to improve the language? Requiring all experiments to go through a standardization committee before real-world testing?

None of those strike me as obviously better.


I am definitely not against language extensions in general. I think that it's one of the unique features of Haskell that keep it stable and flexible at the same time.

If I already have `ScopedTypeVariables` in scope for other valid reasons, I am happy to use it. It's just that I'd rather not enable it for tiny cosmetic improvement like this.


If it makes you feel any better, I suspect that ScopedTypeVariables is basically how it should have worked in the first place. (The default behavior is pretty weird/surprising.)


Almost all Haskell code written these days uses at least a handful, if not a barrel-full, of extensions, and ScopedTypeVariables is one of the most common ones. Of course, you’re completely free to do whatever you’d like with your own code, but you’ll probably find it hard to avoid extensions — at the least many of them will be used by some libraries you’re depending on.


In general you can't do it in C++. There is no way to propagate the result type to the function call itself.

The workaround is to return a proxy object with a template conversion operator, which is very effective but is a leaky abstraction.

edit: as long as you are willing to manually specify the result type you can of course have an additional template argument which is not deduced from any function arguments.


This is part of the reason why I still feel constrained when using systems where the type system is bolted on after the fact, like TypeScript, Flow, or Dialyzer. You can gain so much leverage from the type system being able to generate code like this for you.

ML can do this, but you have to mention the module explicitly. OCaml's upcoming modular implicits will allow this with less boilerplate though.


If I've understood this correctly, you can more or less do this in C++ by using voldemort types.

Here's a nonsensical but demonstrative example program:

  #include <string>
  #include <iostream>
  
  auto polyret()
  {
      struct proof_of_concept
      {
          operator int() { return 4; }
          operator double() { return 1.5; }
          operator std::string() { return "I'm a string!"; }
      };
      return proof_of_concept();
  }
  
  int main()
  {
      int a = polyret();
      double b = polyret();
      std::string c = polyret();
      std::cout << "int: " << a << std::endl; //prints 4
      std::cout << "double: " << b << std::endl; //prints 1.5
      std::cout << "string: " << c << std::endl; //prints "I'm a string!"
  }


Oddly, at the totally other end of the type-system spectrum, the "wantarray" feature of Perl allows you to implement this behavior yourself for a subset (two, in this case) of Perl's main data types*.

http://perldoc.perl.org/functions/wantarray.html

You have to wire it up yourself, and it happens at runtime (because most everything interesting in Perl does) but your function can alter its return type from a scalar to a vector (list, in Perl parlance. Pearlance? Nevermind) based on the way in which it is invoked.

I've seen this used for great evil and great good, but regardless: it's an interesting case in point that such features can be present (albeit in really different forms) and useful even in languages that bear very little resemblance to Haskell/Rust/etc.

*contexts, in this case, but it allows for equivalent behavior to return-type polymorphism in a very limited way.

Edit: formatting.


Swift supports this. Having said that, I don't think I've encountered it in the standard library.

I recently started using the Decodable protocol and was surprised to find out that the `decode` methods all take an explicit type, e.g. `KeyedDecodingContainer.decode(Double.Type, forKey: KeyedDecodingContainer.Key)`. There's a method for each supported type. [1]

Not sure why they didn't just have one method to rule them all in the public interface, like this:

    extension KeyedDecodingContainer {
        public func decode<T: Decodable>(key: KeyedDecodingContainer.Key) throws -> T {
            return try decode(T.self, forKey: key)
        }
    }
The usage then becomes much simpler (assume `name` is a property of type `String`):

    // Before:
    name = try container.decode(String.self, forKey: .name)
    // After:
    name = try container.decode(key: .name)
[1]: https://developer.apple.com/documentation/swift/keyeddecodin...


Once you've got return type polymorphism, you really start to miss it in other languages. The simplest example possible is "mempty"

mempty :: a

gets the "default" value of a. Which makes no sense in a language where you need an instance to have polymorphism.

(This, incidentally, is also why all OO serialization libraries are awful.)
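To make the mempty point concrete: it takes no arguments at all, so only the demanded return type can select the instance (a toy sketch):

```haskell
import Data.Monoid (Sum (..))

-- mempty has no arguments to drive instance selection;
-- the demanded return type alone decides which identity you get.
main :: IO ()
main = do
  print (mempty :: String)            -- ""
  print (mempty :: [Int])             -- []
  print (getSum (mempty :: Sum Int))  -- 0
```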


Not sure if I exactly follow, but this is an implementation of Monoid in C#. The interface can be seen as the type-class definition. The structs are the equivalent of class instances.

If you look at the `static class Monoid` then you can see a general implementation of mconcat which returns an A and works with Empty and Append.

The Program at the end shows it in use with List and String types.

    public interface Monoid<A>
    {
        A Empty();
        A Append(A x, A y);
    }

    public struct MString : Monoid<string>
    {
        public string Append(string x, string y) => x + y;
        public string Empty() => "";
    }

    public struct MList<A> : Monoid<List<A>>
    {
        public List<A> Append(List<A> x, List<A> y) => x.Concat(y).ToList();
        public List<A> Empty() => new List<A>();
    }

    public static class List
    {
        public static S Fold<S, A>(this IEnumerable<A> ma, S state, Func<S, A, S> f)
        {
            foreach(var a in ma)
            {
                state = f(state, a);
            }
            return state;
        }

        public static List<A> New<A>(params A[] items) => new List<A>(items);
    }

    public static class Monoid
    {
        public static A Concat<MA, A>(IEnumerable<A> ma) where MA : struct, Monoid<A> =>
            ma.Fold(default(MA).Empty(), default(MA).Append);

    }

    class Program
    {
        static void Main(string[] args)
        {
            var strs = new[] { "Hello", ",", " ", "World" };
            var lists = new[] { List.New(1, 2, 3), List.New(4, 5, 6) };

            var str = Monoid.Concat<MString, string>(strs);
            var list = Monoid.Concat<MList<int>, List<int>>(lists);
        }
    }
Obviously it's not as elegant as Haskell, but does this not fit your requirement?


Can you see how that small change on the interface makes it awful to write code that is generic on Monoid?

It does not fit the requirements. It makes the kind of code people write in Haskell absolutely not viable.


Why don't you deal with specifics? I am not talking about its attractiveness, I have already made clear that I think Haskell is more elegant. But, why is this "not viable" or "does not fit the requirements"? I have already shown how a totally generic version of mconcat can be implemented - so how is this not writing "code that is generic on Monoid"?


One very specific example: I create a logging module in Haskell. There I put some higher-order functions like this:

    module Log where

        -- | Runs the code inside a catch, logs any exception
        exceptions :: IO a -> IO a
        
        -- | Logs every code execution
        access :: IO a -> IO a

        -- And so on, for several different kinds of logging
Then on the main code I do:

    import qualified Log as Log

    main = do
        -- Lots of stuff
        Log.exceptions . Log.access $ readSomeData
        Log.exceptions $ whateverThatCanFail
        Log.access $ shouldntFailButICareAboutRunning
On C# each use of those logging functions will be more verbose than copying the entire function body in place. And almost as brittle.


Sorry, where did I say C# is less verbose than Haskell? And what has that got to do with implementing polymorphic return values?


You're right, that does work, but you're having to declare every type everywhere. Each time I add another operation to your example, the type declarations get worse and worse. My point is, return type polymorphism and ad-hoc polymorphism make a whole bunch of things ergonomic. The fact that C# can do the same thing in a way that no-one would use is kind of the point.


> declare every type everywhere

What does that mean? The types are declared exactly once.

If you mean C# doesn’t do type inference, well yeah.

But, I still don’t see how this changes my original point, which was merely to demonstrate how to do return type polymorphism in a language other than the sacred Haskell, as the original post implies that it’s not achievable when not working in Haskell.


Ah, I see the confusion. Shouldn’t have used “declare” in that context. My point was that you needed to specify the type every time you used it. So the more operations you add the more your code looks like generic soup.

I think you’re confused as to what return type polymorphism is, though. It’s the ability to have the compiler infer the type of something from its site of use. So your example doesn’t exhibit it because the types need to be specified.

So, the following code works in Haskell

    x = [1, 2, read "3"]

    y = ["1", "2", read "\"3\""]

In the first case read returns an Int, in the second a String. This is a useless toy example but it turns out to be really useful and to make ergonomic a huge number of things that are just painful in the languages we normally use.

Everything’s achievable in every language, but what’s convenient changes massively.
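Spelled out as a runnable snippet (note that `read` at type String expects a quoted literal in its input, so the second case needs escaped quotes):

```haskell
main :: IO ()
main = do
  let x = [1, 2, read "3"] :: [Int]  -- read selects the Int parser
      y = ["1", "2", read "\"3\""]   -- read selects the String parser
  print x  -- [1,2,3]
  print y  -- ["1","2","3"]
```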


> I think you’re confused as to what return type polymorphism is, though. It’s the ability to have the compiler infer the type of something from its site of use.

I think you may be confused about what return type polymorphism is tbh. Type inference != polymorphism. Polymorphism is a type-system feature. Type inference is a separate process to ascertain concrete types at compile time.

For example, the only type inference that C# really does is `var`:

    var x = foo();
That is no different to me writing:

    int x = foo();
(assuming that foo returns an int that is).

The compiled version of both of those code snippets are the same. The fact that I specified the `int` directly doesn't make those chunks of code any different. Or, make one less valid.

You're right it's absolutely the case that to do this I'd need to specify the types I want to work with, and that's because C# is shit at inferring anything. It doesn't change the fact that the return type is polymorphic for Concat in my example. It's _ad-hoc polymorphic_, and no lack of type-inference will change that.

But this still misses the entire point of what I wrote. Which was to counter the point made in your original post:

> Once you've got return type polymorphism, you really start to miss it in other languages.

Initially I felt I was helping by pointing out that if you're missing it in other (mainstream) languages, then there's absolutely a way of doing it.

> The simplest example possible is "mempty"
>
> mempty :: a
>
> gets the "default" value of a. Which makes no sense in a language where you need an instance to have polymorphism.

And secondly I felt it important to use your example of `mempty` to show how to do it.


This uses an unsafe "default(MA)" construct to hack around the type system, right? There's no way to write code like this and not have your code fail with NPEs at runtime except for manually inspecting every "default(...)" call to check that it's called on the right kind of type.


Wrong. MA is constrained to struct, and structs can’t be null.


Ok, but someone has to manually check that, since someone could write "default(MA)" with MA not being a struct and this wouldn't be obvious at the call site. And even if we do find a way to automatically enforce that it is a struct, default won't necessarily put it in a valid state, right? (e.g. if the struct contains reference types then we've just moved the problem one step down: the struct can't be null but the things inside the struct can be null).

Edit: Also does this "default" mechanism extend to allowing us to compose typeclass instances out of smaller typeclass instances? E.g. the monad instance for Writer is defined as:

    instance (Monoid w) => Monad (Writer w) where 
        return a             = Writer (a,mempty) 
        (Writer (a,w)) >>= f = let (a',w') = runWriter $ f a in Writer (a',w `mappend` w')
i.e. we can obtain a Monad<Writer<W, ?>> for any W for which we have a Monoid<W>.
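A self-contained sketch of that composition in Haskell (modernized slightly: Functor/Applicative are mandatory superclasses of Monad in current GHC, and `mappend` is written `<>`):

```haskell
newtype Writer w a = Writer { runWriter :: (a, w) }

instance Functor (Writer w) where
  fmap f (Writer (a, w)) = Writer (f a, w)

instance Monoid w => Applicative (Writer w) where
  pure a = Writer (a, mempty)
  Writer (f, w1) <*> Writer (a, w2) = Writer (f a, w1 <> w2)

-- the Monad instance exists for any w that has a Monoid instance
instance Monoid w => Monad (Writer w) where
  Writer (a, w) >>= f =
    let (a', w') = runWriter (f a)
    in  Writer (a', w <> w')

tell :: w -> Writer w ()
tell w = Writer ((), w)

main :: IO ()
main = print . runWriter $ do
  tell ["step one"]
  tell ["step two"]
  pure (42 :: Int)
-- prints (42,["step one","step two"])
```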


Yes, it’s possible for programmers to write bugs.


Well if you don't care about type safety then there's no point caring about any typesystem features, since you can emulate them by replacing all of your types with "any".


Sorry, where on earth did I say I don’t care about type safety? Why do you need to take this point to a total extreme? I simply gave an example of why the comment about mempty was wrong; but now I have to defend C#’s type system?

Clearly C#’s lack of type inference, sanctioned ad-hoc polymorphism (even though it can’t be achieved in the way I have shown), and higher kinds makes it less expressive as a language. I’m not going to argue that point.

But this kind of language holy war is frankly pathetic. Attacking every detail of an implementation (that works) is unnecessary.

Yes, it’s easier to get null reference exceptions in C# compared to Haskell. That is the result of poor decisions made when the language was designed. So, yes, today I will use ad hoc polymorphic techniques and yes I will have to make sure I constrain to structs, that’s life.


> Sorry, where on earth did I say I don’t care about type safety? Why do you need to take this point to a total extreme? I simply gave an example of why the comment about mempty was wrong; but now I have to defend C#’s type system?

If you're going to dismiss safety issues in your approach with "Yes, it’s possible for programmers to write bugs." then there's no point having the conversation, because that's an equally good argument for not having a type system at all.

> But this kind of language holy war is frankly pathetic. Attacking every detail of an implementation (that works) is unnecessary.

It's not a "detail", if you can't do it safely then that undermines the point of doing it at all. If we were willing to be unsafe we could just cast to the desired type.

> Yes, it’s easier to get null reference exceptions in C# compared to Haskell. That is the result of poor decisions made when the language was designed. So, yes, today I will use ad hoc polymorphic techniques and yes I will have to make sure I constrain to structs, that’s life.

I'd sooner pass the module dictionary explicitly, like one does in ML, than adopt a technique that would normalize having "default(...)" in my codebase.


> If you're going to dismiss safety issues in your approach with "Yes, it’s possible for programmers to write bugs." then there's no point having the conversation, because that's an equally good argument for not having a type system at all.

Absolute nonsense. I didn't dismiss safety issues at all. I dismissed your claim that having to specify a `struct` constraint somehow makes the feature unworthy.

C# has null, that's a fact of life, it's not dismissive to realise that a (granted, very annoying) part of the job of writing C# is dealing with null. So, using this doesn't make this technique any less safe than any other way of writing code in C#. So, yes, programmers will occasionally write null-dereference bugs in C# - that's the price we pay for bad language implementation decisions.

Stating "that's an equally good argument for not having a type system at all." is clearly hyperbolic nonsense.

> If we were willing to be unsafe we could just cast to the desired type.

But it isn't unsafe! Not specifying a `struct` constraint is a bug. If you provide the constraint then it's safe. Trying to compare that to a dynamic cast where you have no type-system enforcement to one where you do is just idiotic.

> I'd sooner pass the module dictionary explicitly, like one does in ML, than adopt a technique that would normalize having "default(...)" in my codebase.

At no point was this trying to force you to use this technique. It was a reply to "Once you've got return type polymorphism, you really start to miss it in other languages. The simplest example possible is mempty".

I use this technique very successfully a lot, and the exact mechanism (of using `default`) is in the process of being wrapped up into a new type-classes grammar for C# [1]. So, I guess you'd probably prefer to wait for that...

[1] https://github.com/MattWindsor91/roslyn/blob/master/concepts...


> C# has null, that's a fact of life, it's not dismissive to realise that a (granted, very annoying) part of the job of writing C# is dealing with null. So, using this doesn't make this technique any less safe than any other way of writing code in C#.

If using this technique requires breaking one of the rules that you have to follow to avoid getting nulls in C# then the technique is a safety problem.

> Not specifying a `struct` constraint is a bug. If you provide the constraint then it's safe.

Ok, but how do you enforce that? If you've got a technique that requires manual review and reasoning at a distance to use safely, then again we're no better off than we would be using dynamic casts.

> At no point was this trying to force you to use this technique. It was a reply to "Once you've got return type polymorphism, you really start to miss it in other languages. The simplest example possible is mempty".

If you don't have a typesystem feature in a safe way, you don't have it.


> Ok, but how do you enforce that? If you've got a technique that requires manual review and reasoning at a distance to use safely, then again we're no better off than we would be using dynamic casts.

More hyperbole. Failing to constrain may lead to a null reference exception. Just like passing a reference to any method anywhere in C#. It is no better and no worse than any other C# code. However it does allow for ad-hoc polymorphic return values. Which is the entire point. That is not the same as returning a dynamic value, which is a type that propagates dynamic dispatch wherever it's passed. A failure to capture a null reference bug means on first usage it will blow up - so you fix the code and everything is type safe.

> If you don't have a typesystem feature in a safe way, you don't have it.

The feature is safe. Your argument is the same as saying C# doesn't have classes because a reference can be null, or C# doesn't have fields because a field can be null. All throughout this frankly tedious discussion you have somehow conflated having a bug in an application with having no type system at all. C#'s type system is obviously nowhere near as impressive as Haskell, but C# is actually used in the real world much more, and so if someone wants polymorphic return values then they can. I mean they can anyway through inheritance, never mind the ad-hoc approach I demonstrated - but whatever yeah?


> Failing to constrain may lead to a null reference exception. Just like passing a reference to any method anywhere in C#.

But you can adopt a small set of rules that are locally enforceable (and practical to use in an automatic linter) to prevent this happening (just as Haskell is safe even though unsafePerformIO exists, because you can adopt a small set of locally enforceable rules like "never use unsafePerformIO"). Unfortunately one of those rules has to be to never use default().

> That is not the same as returning a dynamic value, which is a type that propagates dynamic dispatch wherever it's passed. A failure to capture a null reference bug means on first usage it will blow up - so you fix the code and everything is type safe.

Unfortunately default() isn't fail-fast in all cases - when used with e.g. a struct type containing a reference type, it will create the value in an invalid state (containing a null reference) but you won't necessarily notice until you come to use the value, arbitrarily many compilation units away. So it's just as dangerous as a dynamic value.

> All throughout this frankly tedious discussion you have somehow conflated having a bug in an application with having no type system at all.

In almost any language you can have polymorphic return values without complete type safety. The feature that Haskell has here isn't that you can have polymorphic return values - it's that you can have polymorphic return values safely. Showing an unsafe implementation of polymorphic return values in some other language is pointless and irrelevant.


> Unfortunately default() isn't fail-fast in all cases

It's purely a means of dispatch, if someone wants to put member variables in that are never used - good luck to them. For some reason you think that because C# doesn't protect you from being an idiot you can't do return type polymorphism. Well that's completely incorrect and you know it. The reference of default(A) isn't something that's passed around - yes the method you dispatch to has access to `this`, but what's the point of A: declaring a variable in a 'class instance' and B: using it when it's in an invalid state? It's what a moron would do. I don't call `((string)null).ToString()` because it's fucking stupid. But I assume in your world that means C# can't do method dispatch by reference?

Just because somebody can do something stupid doesn't devalue any particular technique that requires you to not do the stupid thing. Otherwise, you may as well delete C# as a language - because it's trivially easy to do stupid things. In fact software engineering wouldn't even have gotten off the ground if that was a pre-requisite.

But clearly people do produce software in it - which proves your arguments wrong.

> Showing an unsafe implementation of polymorphic return values in some other language is pointless and irrelevant.

Show me where it was mentioned in the original comment about safety? Not there is it. You just jumped in with inaccurate claims and went on some tangent about type-system safety, like C# would ever win any type-system safety contests.

Leaving aside the fact that all of your arguments about safety are nonsense for the moment... let's do it another way:

    public interface Monoid<MA, A> where MA : struct, Monoid<MA, A>
    {
        A Empty();
        A Append(A x, A y);
    }

    public struct MString : Monoid<MString, string>
    {
        public string Append(string x, string y) => x + y;
        public string Empty() => "";
    }

    public struct MList<A> : Monoid<MList<A>, List<A>>
    {
        public List<A> Append(List<A> x, List<A> y) => x.Concat(y).ToList();
        public List<A> Empty() => new List<A>();
    }

    public static class Monoid
    {
        public static A Concat<MA, A>(IEnumerable<A> ma) where MA : struct, Monoid<MA, A> =>
            ma.Fold(default(MA).Empty(), default(MA).Append);
    }

    class Program
    {
        static void Main(string[] args)
        {
            var strs = new[] { "Hello", ",", " ", "World" };
            var lists = new[] { List.New(1, 2, 3), List.New(4, 5, 6) };

            var str = Monoid.Concat<MString, string>(strs);
            var list = Monoid.Concat<MList<int>, List<int>>(lists);
        }
    }
That is now safe in that `Concat` can't be implemented without the `struct` constraint: the code will fail to compile. Also the types that implement `Monoid<MA, A>` must be structs.

I'm out of this discussion now - because if you're still going to claim this is unsafe then you're clearly trolling and I haven't really got the motivation to keep feeding you.


> The reference of default(A) isn't something that's passed around - yes the method you dispatch to has access to `this`, but what's the point of A: declaring a variable in a 'class instance' and B: using it when it's in an invalid state?

It's not something you'd deliberately do, but in any decent-sized codebase, everything the language permits will happen. If it's possible to exclude a given pitfall with a simple, local lint rule then you might be able to avoid it, but manual review of anything that can happen at a distance is doomed to failure.

> I don't call `((string)null).ToString()` because it's fucking stupid. But I assume in your world that means C# can't do method dispatch by reference?

Unless you can use a very simple set of local rules to avoid having that happen, yes. Fortunately, there is such a set of rules you can follow (namely never writing null, never using constructs that return null, and checking the return values of library calls for null immediately) and so null (barely) doesn't destroy the language completely.

> Just because somebody can do something stupid doesn't devalue any particular technique that requires you to not do the stupid thing.

If your technique makes it impossible to use simple rules to avoid doing the stupid thing, then yes, that does devalue the technique. Because at that point having the stupid thing happen in your codebase is just inevitable.

> Otherwise, you may as well delete C# as a language - because it's trivially easy to do stupid things.

I already did, thanks.

> In fact software engineering wouldn't even have gotten off the ground if that was a pre-requisite.

Nonsense. Typed lambda calculi predate mechanical computers and don't allow you to do anything stupid. We could've built software engineering on them.

> But clearly people do produce software in it - which proves your arguments wrong.

People produce software in C#, but it takes more effort and has higher defect rates than doing so in Haskell-like languages.

> Show me where it was mentioned in the original comment about safety? Not there is it.

It's implicit because a) Haskell is a safe-by-default language b) return type polymorphism without safety is completely trivial. In e.g. Python you can just have Concat return "", [], or something else; likewise you can do the same in C# if you're happy to cast. So clearly moomin can't miss just being able to have a function that returns "" or [], because what language could they possibly be working in where that would be impossible or even at all difficult?

> That is now safe in that `Concat` can't be implemented without the `struct` constraint, the code will fail to compile. Also the types that implement `Monoid<MA, A>` must be structs.

But a) I have to allow "default(MA)" expressions in my program, which means I have no way to ban the unsafe use of default() b) nothing stops an implementation of Monoid<MA, A> being a struct that contains a reference, in which case that reference will be null when the struct is initialized with default(). It doesn't solve the problem at all.


That's why we put a struct constraint in. We'll quietly ignore the fact that non-primitive structs are as rare as hen's teeth in most C# code-bases...


So you might be able to use some kind of linter to enforce that you only ever call default(MA) where MA: struct, but even then, is it safe to assume that default(x) instantiates x in a valid state for all structs? Wouldn't that then mean that e.g. you couldn't ever use a struct containing a reference type anywhere in your codebase, since if you do then default(x) initialises that struct to contain a null reference, right?


The structs used as ‘class instances’ aren’t stateful.

The compiler will actually optimise out the ‘default’ also, so it’s as efficient as calling a static method.

Anyway, nobody is arguing that this is some perfect system, merely that return type polymorphism can be achieved _relatively_ painlessly in a language other than Haskell.


OTOH, I feel it's taken too far in the regex packages. They feel extremely daunting for someone just getting started with Haskell. E.g. the wiki says that Text.Regex.TDFA is the thing to use. First try:

    λ> "abc" =~ "(a|b).*"

    <interactive>:1558:1-18:
        Non type-variable argument
        in the constraint: RegexContext
                            Text.Regex.TDFA.Regex source1 target
        (Use FlexibleContexts to permit this)
        When checking that ‘it’ has the inferred type
        it :: forall source1 target.
                (Data.String.IsString source1,
                RegexContext Text.Regex.TDFA.Regex source1 target) =>
                target

    <interactive>:1558:7-8:
        Could not deduce (RegexMaker
                            Text.Regex.TDFA.Regex CompOption ExecOption source0)
        arising from a use of ‘=~’
        from the context (Data.String.IsString source1,
                        RegexContext Text.Regex.TDFA.Regex source1 target)
        bound by the inferred type of
                it :: (Data.String.IsString source1,
                        RegexContext Text.Regex.TDFA.Regex source1 target) =>
                        target
        at <interactive>:1558:1-18
        The type variable ‘source0’ is ambiguous
        Note: there are several potential instances:
        instance RegexMaker
                    Text.Regex.TDFA.Regex
                    CompOption
                    ExecOption
                    Data.ByteString.Internal.ByteString
            -- Defined in ‘Text.Regex.TDFA.ByteString’
        instance RegexMaker
                    Text.Regex.TDFA.Regex
                    CompOption
                    ExecOption
                    Data.ByteString.Lazy.Internal.ByteString
            -- Defined in ‘Text.Regex.TDFA.ByteString.Lazy’
        instance RegexMaker
                    Text.Regex.TDFA.Regex
                    CompOption
                    ExecOption
                    (Data.Sequence.Seq Char)
            -- Defined in ‘Text.Regex.TDFA.Sequence’
        ...plus one other
        In the expression: "abc" =~ "(a|b).*"
        In an equation for ‘it’: it = "abc" =~ "(a|b).*"

    <interactive>:1558:10-18:
        Could not deduce (Data.String.IsString source0)
        arising from the literal ‘"(a|b).*"’
        from the context (Data.String.IsString source1,
                        RegexContext Text.Regex.TDFA.Regex source1 target)
        bound by the inferred type of
                it :: (Data.String.IsString source1,
                        RegexContext Text.Regex.TDFA.Regex source1 target) =>
                        target
        at <interactive>:1558:1-18
        The type variable ‘source0’ is ambiguous
        Note: there are several potential instances:
        instance Data.String.IsString
                    aeson-1.2.1.0:Data.Aeson.Types.Internal.Value
            -- Defined in ‘aeson-1.2.1.0:Data.Aeson.Types.Internal’
        instance Data.String.IsString
                    Data.ByteString.Builder.Internal.Builder
            -- Defined in ‘Data.ByteString.Builder’
        instance Data.String.IsString Data.ByteString.Internal.ByteString
            -- Defined in ‘Data.ByteString.Internal’
        ...plus 9 others
        In the second argument of ‘(=~)’, namely ‘"(a|b).*"’
        In the expression: "abc" =~ "(a|b).*"
        In an equation for ‘it’: it = "abc" =~ "(a|b).*"
Scary. Next most promising hit on ddg is bos' tutorial at http://www.serpentine.com/blog/2007/02/27/a-haskell-regular-... which says I should be able to get a list of results by specifying the context [String]:

    λ> "abc" =~ "(a|b).*" :: [String]

    <interactive>:1497:1-5:
        No instance for (Data.String.IsString source10)
        arising from the literal ‘"abc"’
        The type variable ‘source10’ is ambiguous
        Note: there are several potential instances:
        instance Data.String.IsString
                    aeson-1.2.1.0:Data.Aeson.Types.Internal.Value
            -- Defined in ‘aeson-1.2.1.0:Data.Aeson.Types.Internal’
        instance Data.String.IsString
                    Data.ByteString.Builder.Internal.Builder
            -- Defined in ‘Data.ByteString.Builder’
        instance Data.String.IsString Data.ByteString.Internal.ByteString
            -- Defined in ‘Data.ByteString.Internal’
        ...plus 7 others
        In the first argument of ‘(=~)’, namely ‘"abc"’
        In the expression: "abc" =~ "(a|b).*" :: [String]
        In an equation for ‘it’: it = "abc" =~ "(a|b).*" :: [String]

    <interactive>:1497:7-8:
        No instance for (RegexContext
                        Text.Regex.TDFA.Regex source10 [String])
        arising from a use of ‘=~’
        The type variable ‘source10’ is ambiguous
        Note: there is a potential instance available:
        instance RegexLike a b => RegexContext a b [[b]]
            -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
        In the expression: "abc" =~ "(a|b).*" :: [String]
        In an equation for ‘it’: it = "abc" =~ "(a|b).*" :: [String]

    <interactive>:1497:10-18:
        No instance for (Data.String.IsString source0)
        arising from the literal ‘"(a|b).*"’
        The type variable ‘source0’ is ambiguous
        Note: there are several potential instances:
        instance Data.String.IsString
                    aeson-1.2.1.0:Data.Aeson.Types.Internal.Value
            -- Defined in ‘aeson-1.2.1.0:Data.Aeson.Types.Internal’
        instance Data.String.IsString
                    Data.ByteString.Builder.Internal.Builder
            -- Defined in ‘Data.ByteString.Builder’
        instance Data.String.IsString Data.ByteString.Internal.ByteString
            -- Defined in ‘Data.ByteString.Internal’
        ...plus 7 others
        In the second argument of ‘(=~)’, namely ‘"(a|b).*"’
        In the expression: "abc" =~ "(a|b).*" :: [String]
        In an equation for ‘it’: it = "abc" =~ "(a|b).*" :: [String]
… so something about overloaded strings is confusing the type checker probably. Since I'm not a complete newbie I know I can specify string types explicitly:

    λ> ("abc"::String) =~ ("(a|b).*"::String) :: [String]

    <interactive>:1504:17-18:
        No instance for (RegexContext
                        Text.Regex.TDFA.Regex String [String])
        arising from a use of ‘=~’
        In the expression:
            ("abc" :: String) =~ ("(a|b).*" :: String) :: [String]
        In an equation for ‘it’:
            it = ("abc" :: String) =~ ("(a|b).*" :: String) :: [String]
That's … shorter, but still daunting. If I'm a bit more experienced, I know that I "just" have to do

    λ> :i RegexContext
    class RegexLike regex source =>
        RegexContext regex source target where
    match :: regex -> source -> target
    matchM :: Monad m => regex -> source -> m target
            -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.RegexLike’
    instance RegexContext Text.Regex.TDFA.Regex String String
    -- Defined in ‘Text.Regex.TDFA.String’
    instance RegexContext Text.Regex.Regex String String
    -- Defined in ‘Text.Regex.Posix.String’
    instance RegexLike a b => RegexContext a b [[b]]
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b => RegexContext a b (MatchResult b)
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b => RegexContext a b Int
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b => RegexContext a b Bool
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b =>
            RegexContext
            a b (AllTextSubmatches [] (b, (MatchOffset, MatchLength)))
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b => RegexContext a b (AllTextSubmatches [] b)
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b => RegexContext a b (AllTextMatches [] b)
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b =>
            RegexContext a b (AllSubmatches [] (MatchOffset, MatchLength))
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b =>
            RegexContext a b (AllMatches [] (MatchOffset, MatchLength))
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b => RegexContext a b (b, b, b, [b])
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b => RegexContext a b (b, b, b)
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b =>
            RegexContext a b (MatchOffset, MatchLength)
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
    instance RegexLike a b => RegexContext a b ()
    -- Defined in ‘regex-base-0.93.2:Text.Regex.Base.Context’
and somewhere in that long list of instances I see that although [String] is not an instance, [[String]] is, so finally I see I can do

    λ> ("abc"::String) =~ ("(a|b).*"::String) :: [[String]]
    [ [ "abc" , "a" ] ]

Of course, the end result looks nice in the code, but it takes forever to discover the API when you're new to the system (and the docs have next to no examples).


Most Haskellers I know think that library is too complicated—it's trying too hard to make Haskell look like Perl. I haven't seen it used in serious code and I never use it myself; chances are, if I want to parse anything even remotely complicated, a library like Parsec is a better bet.


It is daunting and those errors are very hard to follow even after you have some experience with them.

That said, there's a component of "you are using it wrong" in your problem. In the middle of some code, when your data is already in well-typed variables, you just write:

    if text =~ "(a|b).*" then trueVal else falseVal
and it just works. It doesn't even matter if text is a String, Text, ByteString, or whatever. Also you just write:

    putStr . concat $ text =~ "(a|b).*"
and again, it just works. It does not matter that this is a completely different usage.
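To illustrate the mechanism at work there, here is a miniature sketch (hypothetical `MatchResult` class and `=~?` operator, not the real regex-tdfa API, with substring search standing in for an actual regex engine): one operator serves several result types because the caller's expected type selects the instance that converts the raw match list.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import Data.List (isInfixOf)

-- The result type picks how matches are presented.
class MatchResult r where
  fromMatches :: [String] -> r

instance MatchResult Bool where
  fromMatches = not . null          -- used in an `if`: did anything match?

instance MatchResult [String] where
  fromMatches = id                  -- used where a list is expected

-- A toy "engine": substring search standing in for a real regex matcher.
(=~?) :: MatchResult r => String -> String -> r
text =~? pat = fromMatches [pat | pat `isInfixOf` text]

main :: IO ()
main = do
  print (("abcd" =~? "bc") :: Bool)       -- True
  print (("abcd" =~? "bc") :: [String])   -- ["bc"]
```

The same call site returns a Bool or a list depending only on the surrounding type context, which is exactly why the real library "just works" in both usages above.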


The trick there is to use an explicitly polymorphic instance, which can "decide what it should have been all along" later on. I've written about the idea: https://eighty-twenty.org/2015/01/25/monads-in-dynamically-t.... (OO folklore has long had some form of this idea; in limited forms, you see it in Smalltalk images stretching back decades.)


Stuff like this makes me wonder why a) Haskell isn't used more widely and b) it isn't taught in school more. It seems it's a language that's intellectually stimulating and shows how much good language design can help with programming work.


This is just my opinion, but as someone who has actually worked full-time with Haskell (though admittedly not since 2015), I think a lot of the problem is due to Haskell's awful error messages and, for most of its life, awful tooling.

When a newbie starts using Haskell, they are immediately put off by the fact that the messages are...esoteric to say the least; weird things like `result a0, expected [b0]` can be off-putting, and make things like F# a much more attractive option.

Also, I feel like, until Intero, editor integration with ghc-mod was incredibly difficult to set up, and fairly easy to break, to the point where I eventually uninstalled it.

Cabal, while certainly interesting, would end up in weird scenarios where dependencies wouldn't resolve correctly, resulting in "cabal hell".

Stack has greatly improved on all of these points, and hopefully Haskell catches on a bit more, because despite my criticisms, I do think it's a fantastic language and platform.


Definitely the tooling is still a big problem, but there have been great improvements lately!

- HIE (Haskell IDE Engine) finally got some steam behind it, and has been chugging along for some time now, being very usable and targeting LSP!

- Intero (as you mention)

- ghcid is a nice lightweight alternative

I personally feel stack + hie is a very good setup, and can be used in VSCode, Atom, Neovim and emacs via LSP clients.

Error messages are slowly improving. GHC 8.2 got nice highlighting of the location of the error and some colouring, bringing it just a little further. I personally like how nice Elm is, but then again it has the advantage of a much simpler type system, which significantly simplifies error reporting.


You just highlighted how out of practice I am with Haskell...I need to fix that! I think I will play with HIE this weekend.

I definitely think Stack is excellent, and greatly reduces the cost-of-entry for new devs, and it's only getting better, so Haskell is far from hopeless.


Absolutely agree - the difference between the error messages in Haskell and its derivative Elm is night and day (see http://elm-lang.org/blog/compiler-errors-for-humans , http://elm-lang.org/blog/compilers-as-assistants ).


My experience is that it is really difficult to estimate runtime performance and memory requirements of Haskell programs. Not that it is impossible, but the straight-forward solutions often blow up in memory.

Furthermore, libraries vary immensely in quality, and for beginners it is hard to pick the right one for the job. Unfortunately most of them are also documented quite sparsely.

It is also non-trivial to get a good web-stack up and running and there are many competing solutions using different architectures and with a different stance on type-safety. Their poor performance in benchmarks like https://www.techempower.com/benchmarks/ does not help either.

I don't want to dismiss Haskell and love the idea of FP especially using advanced type systems to get rid of possible bugs before runtime, but using it for specific tasks is not always straight-forward compared to more established languages.


I was a teaching assistant in courses teaching Haskell, and I found that most students didn't want to be intellectually stimulated in this way...


I'm not surprised - maybe it's just too early. I assume you're talking about undergrads. I didn't get interested in type theory until I'd been a programmer for many years. I think a few students will be immediately drawn to stuff like this but others will need to learn and use a few programming languages to appreciate the finer points.


I guess that's in line with a lot of CS graduates I see at work.


AFAIK (according to my FP teacher): a) The compiler is very complex, and the lazy nature of Haskell means generated code is hard to debug and optimise. Small changes to the code might change how things get executed, so you lose control, whereas in OCaml it's quite straightforward and you can guess what the generated code will look like. b) We had a quite good intro at our university.

I like how succinct Haskell/ML syntax is. But I prefer practical work and control, so I personally plan to learn/use either F# or Reason/OCaml.


Minor clarification that doesn't pertain to the general premise of the article:

> However, if we want to use it on user-defined types, we'll need to implement the Ord class manually:

For user-defined types, you can usually add "deriving Ord" to the definition, and the compiler creates a default implementation for you. You might want to implement it manually if the specifics of how it works are important to the application or if the type contains a field that GHC can't automatically derive Ord for (like a function).

Usually, for most types I use "deriving (Eq, Ord, Show)" unless I have a good reason not to.
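As a small sketch of that (`Priority` is a made-up example type), the derived Ord instance follows the order in which the constructors are declared, so comparison and `maximum` work with no hand-written code:

```haskell
-- GHC derives Eq, Ord, and Show automatically; for Ord,
-- constructors declared earlier compare as smaller.
data Priority = Low | Medium | High
  deriving (Eq, Ord, Show)

main :: IO ()
main = do
  print (maximum [Medium, Low, High])   -- High
  print (Low < High)                    -- True
```

You would only write the Ord instance by hand if, say, High should sort first, or if a field's type (like a function) has no derivable Ord.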


This is one of the big things I miss from implementations of ad-hoc polymorphism in dynamically typed languages - for example protocols in Elixir or Clojure.


"This is a fairly unique and cool aspect of Haskell, especially if you come from the C++ world".

That's not true at all - what Haskell has here that C++ or D don't is type inference so that explicit type annotations aren't needed. The read function in C++ would be

    #include <string>
    #include <iostream>

    using namespace std;

    template<typename T>
    T read(const string& str);

    template<>
    int read(const string& str) {
        return stoi(str);
    }

    template<>
    double read(const string& str) {
        return stod(str);
    }

    template<>
    string read(const string& str) {
        return str;
    }

    int main() {
        cout << read<int>("2") << endl;
        cout << read<double>("3.14") << endl;
        cout << read<string>("foobar") << endl;
    }


The article doesn't cover this, but Haskell's return type polymorphism is strictly more powerful because it can be used in cases where you don't know the type at compile time. Such a situation can occur with recursion on nested data types or with higher-rank types.

Now, I bet with some hackery one could emulate this too, but I don't expect it to be legible.
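A minimal sketch of the higher-rank case (hypothetical `pair` function): the argument is a value usable at *any* Monoid type, and the body instantiates it at two different types. A C++ caller cannot supply one `read<T>`-style template argument that covers both uses.

```haskell
{-# LANGUAGE RankNTypes #-}

-- The caller passes one polymorphic value; the body picks its
-- concrete types internally, at String and at [Int].
pair :: (forall m. Monoid m => m) -> (String, [Int])
pair x = (x, x)

main :: IO ()
main = print (pair mempty)   -- ("",[])
```

Here `mempty` is handed over without ever being pinned to a single type at the call site; each use inside `pair` resolves it separately.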


As a previous C++ and Erlang programmer, polymorphism on return type was the hardest thing for me to internalize. Like, I could repeat the words, but I didn't "grok" it for the longest time. I wish there was some standard way of emphasizing this. (Might help with the Monad tutorial jungle, too, as that's the implementation linchpin)


I think this is just regular type unification in Hindley-Milner type systems?


It’s more than just HM type inference. This relies in particular on ad-hoc polymorphism (many languages with HM type inference only have parametric polymorphism).
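A quick contrast (illustrative names): a parametrically polymorphic function has one implementation for all types and cannot conjure a value of its type variable, whereas an ad-hoc (type class) function dispatches to a per-type implementation, which is what lets the return type drive the choice.

```haskell
-- Parametric: one body for every type a; it cannot inspect a,
-- so the only total implementation is to return its argument.
identity :: a -> a
identity x = x

-- Ad-hoc: mempty has a different definition per instance,
-- selected by the type the context demands.
emptyString :: String
emptyString = mempty          -- ""

emptyList :: [Int]
emptyList = mempty            -- []

main :: IO ()
main = do
  print (identity (42 :: Int))
  print emptyString
  print emptyList
```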


Is it possible to write functions using this, without also using the polymorphic functions already in place in Haskell?


The library functions used in this article are all written in Haskell, not using any particular magic. So yes, it was possible to write those functions, and it's also possible to write your own. Return type polymorphism is just as accessible to "normal" Haskell programmers as to the people who implemented the standard libraries. It's not an internal trick of the implementation the way some other functions (for IO, in particular) are.


As far as I can tell, however, the article gives no examples of this. All the functions that are written are based on "read" and "mempty" and similar. I'm having issues understanding how to write these functions "from scratch".


* define a typeclass with a function generic in its output

* implement the typeclass for relevant types

* invoke the typeclass function in disambiguated contexts

    class Zero z where
      zero :: z

    instance Zero Int where
      zero = 0

    instance Zero Float where
      zero = 0.0

    instance Zero [a] where
      zero = []
example usage:

    let v :: Int; v = zero
or

    let v = zero :: Int
or pass it to something which expects an Int, disambiguating it, e.g.

    > Data.Char.chr zero
    '\NUL'


You can browse the Haskell sources to find Haskell implementations of them.

Here is the implementation for mempty for the "Sum" type family mentioned in the article:

    mempty = Sum 0
https://github.com/ghc/ghc/blob/d987f71aa3200ce0c94bc57c43b4...

And here it is for lists:

    mempty = []
https://github.com/ghc/ghc/blob/d987f71aa3200ce0c94bc57c43b4...

It's harder for "read" because there is a tangle of auxiliary functions. But for Int, for example, it's essentially the same function you would call "atoi" in C: A function that converts a string like "-123" into the integer -123, or signals an error if the input string is not valid.

But I think there may be a very important misunderstanding here. Neither you nor Haskell define THE read function. If you define a new type, you can define a new read function for this type. So if you define some type MyOwnT, you will define a read function of the concrete type "read :: String -> MyOwnT".

Nobody ever defines a "master" generic function "read :: String -> a" that can return any type! There are many concrete read functions, and the compiler chooses, based on type inference and type hints and sometimes type information at runtime, which concrete one to use at that particular call site.

Again: There is no single magic function that knows how to fabricate an instance of any type from a string. There is only a family of non-magic functions that each know how to fabricate an instance of a specific type.
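A toy version of that idea (a hypothetical `Parse` class, not the real Read machinery): each instance is an ordinary, non-magic parser for one concrete type, and the compiler selects among them from the return type at the call site.

```haskell
data Color = Red | Green | Blue
  deriving (Show, Eq)

-- There is no "master" parser; just one method per instance.
class Parse a where
  parse :: String -> a

instance Parse Color where
  parse "Red"   = Red
  parse "Green" = Green
  parse "Blue"  = Blue
  parse s       = error ("no parse: " ++ s)

instance Parse Int where
  parse = read   -- piggy-back on the library's Int parser

main :: IO ()
main = do
  print (parse "Green" :: Color)   -- Green
  print (parse "-123" :: Int)      -- -123
```

Each `parse` call dispatches to a small, concrete function; the "return any type" appearance is just the compiler routing each call site to the right instance.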


The interfaces are defined in the modules Text.Read and Data.Monoid, in the standard library. You just take the interface there and implement it. For example, the Maybe instance of Monoid is something like:

    instance Monoid (Maybe a) where
        mempty = Nothing
        mappend Nothing b = b
        mappend a _ = a
Read is a bit hard to implement. But it's also not something out of this world.


Very impressed by Eli's site. Has taught me a lot and introduced me to many interesting subjects.



