Haskell’s Niche: Hard Problems (cdsmith.wordpress.com)
80 points by fogus on March 14, 2011 | 45 comments



   Haskell shines at making frequent fundamental changes.
my experience of haskell is completely the opposite of this. haskell is terrible when you want to make big changes---it's often quicker to just rewrite completely from scratch. fixing all the little inconsistencies is a real pain. want to add an extra parameter to a type class? want to remove an attribute from a data type? sure, if you picked the right abstraction from the start, this is a non-issue, but that's not how it is with a hard problem.

the type system gets in your way; this is why i now use python (colleagues have noted the same problems when using c++; we prototype in python, scale up in c for the slow bits). often when i'm fleshing out ideas or trying a new approach, i don't need everything to be consistent---just the bit i'm working on, and even then i don't care too much. if things don't quite work, i'd like a quick way to find an inconsistency (often i write a consistency-checker function for each class, which checks types and structures, and use it when debugging).

i find haskell fundamentally useless for tackling hard (research) problems for this reason alone.


> The type system gets in your way

I think that anyone saying this suffers from a fundamental misunderstanding of how to use a type system like Haskell's. I encounter it often with people coming from C/Python who start writing typeclasses, data types etc. and only afterwards start adding functions and then struggle trying to get the right types for their functions.

Fundamentally, programming is a search problem. We are searching for a sufficiently good solution (program) in the space of all possible solutions (all possible compiler inputs). I am of the opinion that the type system can significantly help in searching for such a "good solution". In the words of Conor McBride: "Sometimes it's easier to search for good programs in the space of well typed programs, rather than in the space of ascii turds."

Personally, I start by writing down type names (just names, no implementation or anything) and function names of what I want to do with said types and most importantly what the type of said functions should be (inventing new type names and functions as I discover I need them).

Say, for example, a game would start with GameWorld type and an update function of type "GameWorld -> some events -> GameWorld".

This way I start fleshing out the skeleton of WHAT I want to do, without any regards yet as to HOW I plan on accomplishing it. Once I think my skeleton is somewhat complete I define all my functions to be "undefined" (actually, I do this as I write them down, of course). undefined has all possible types in Haskell, so it always typechecks, but throws an exception when evaluated.

Then I start replacing the undefined's with actual code that matches the signature and start implementing typeclasses/datatypes as required while writing the functions. I will occasionally compile the code to see if everything so far actually type checks.
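
To make that concrete, here's a minimal sketch of that skeleton-first style, using the GameWorld example from above (the Event type and the function names are hypothetical illustrations):

    data GameWorld = GameWorld            -- fields come later, once the shape is clear
    data Event = Tick | KeyPress Char     -- hypothetical event type

    update :: GameWorld -> [Event] -> GameWorld
    update = undefined                    -- typechecks everywhere, throws if evaluated

    render :: GameWorld -> String
    render = undefined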


that's how you write haskell the first time. that's fine. easy, nice, simple. you've written a lot of code; it compiles, it works, no problem.

then you realise that you want to change a data structure or a function needs some information that you want to weave through it or you used the wrong abstraction. then it's a pain in the ass.


There are two points to be made here:

1. With a static type system, you can make the key change at the data type declaration level, and then chase down the compiler type errors to figure out all the places you need to change it. You need to do extra legwork in dynamically typed languages to get this sort of information: how many times have you changed some variable name in Python and forgotten to update the reference somewhere obscure? (There's a sketch of this after these two points.)

2. Yes, if you have some pure, functional code, and then you decide you want to have a feature that uses state, fixing things up can be a bit tedious. I think there's an interesting space here for refactoring tools that help take out the tedium. I will also note that this restriction is a good thing sometimes: you know how you add a global here and a global there and suddenly you have an unmaintainable blob of global state? Haskell asks you to "slow down" and re-think about what you actually want to do. It encourages you to use the minimum amount of language features to get the job done. This leads to more maintainable code.
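
A hedged sketch of point 1 (the type and names are hypothetical): with positional construction, adding a field turns every stale call site into a compile error, and the error list is exactly your to-do list.

    data User = User String String        -- name, email
    -- change it to:
    -- data User = User String String Int -- name, email, age

    mkGuest :: User
    mkGuest = User "guest" "none"         -- after the change, GHC rejects this
                                          -- until an age argument is supplied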


I think there's a space for partial compiles that Haskell doesn't take advantage of well. Correct me if I'm wrong.

Assume that you've got a simple dependency tree on the defined names in a program and you want to refactor the type of one of the leaves. The type system is great here in that it'll track those mismatches straight up to the root.

The trouble appears if you want only an experimental type change at the leaf. It would be excellent to get successful partial compiles threaded up the tree. In other words, it should be possible to make a cut in the type dependence tree in order to test a functional subset of your program under more rapid experimental changes.

Once you've iterated enough to decide on a new formulation of that branch of the dependency tree, you can continue attempting typechecking up to the root node.


One thing I've seen done here is to copy-paste the function under a new name (usually with a leading underscore) and then make the appropriate changes. It's not hooked up to the rest of the system (but that wouldn't have worked anyway), and you can still compile it and poke at it.


That works as long as it's a simple dependency tree like I was talking about and there is a single cut with that property. It's pretty likely you'll have convergent nodes (like changing a datatype's interface late in the game) and there's no simple cut that can let the thing compile.

I mean, sure, this can all be manually avoided, but I still think that having GHCi "compile as much as typechecks" instead of stopping before loading any symbols would be nice.


I don't really see why you'd need that. In those sorts of circumstances I'd either add the new version of the function with a new name, or rename the function and add an oldFunc = undefined line to keep the type checker happy.
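
For instance, a minimal sketch of the rename-and-stub trick (names are hypothetical):

    -- keep old callers typechecking while experimenting with the new version
    oldParse :: String -> Int
    oldParse = undefined                  -- stub; satisfies the type checker

    newParse :: String -> Maybe Int       -- the experimental replacement
    newParse s = if null s then Nothing else Just (length s)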


I have been bitten severely by 2. I think having parameterized modules could go a long way toward eliminating the pain of refactoring.


haskell is terrible when you want to make big changes---it's often quicker to just rewrite completely from scratch. fixing all the little inconsistencies is a real pain.

But without a type system that finds inconsistencies for you, all you can do is hope that a) you didn't miss any or b) they come up at runtime when you exercise the code with tests. Refactoring is hard, but Haskell ensures that partially-refactored code doesn't even compile. That's a good thing, because code that is broken shouldn't be running. With Python, all you can do is cross your fingers.

I've had good success with the "make one huge change and chase the compiler errors" strategy when making major data structure changes like String -> ByteString or [] -> Vector. It's tedious, but that's what would happen in any language when you switch from the native list type to a C array.


Haskell ensures that partially-refactored code doesn't even compile. That's a good thing, because code that is broken shouldn't be running.

But when I'm experimenting I often do want partially refactored code to run, to quickly test out a possibility in a small part of a program. I'm not going to deploy it for real, of course, but I may want to type some things into the REPL before deciding that yes, that was a good change and I'll finish refactoring the whole program that way. In those cases, runtime type errors fit my working style better; I know there are broken things in the program, but I want the compiler to let me optimistically try to execute the not-broken part of the program until it actually encounters an error.

(Lisp works great for that, but is unfortunately less great when I get to the part where I'm done experimenting and do want static checks. SBCL does a little of that, but nowhere near Haskell levels.)


Meh, I spend way more time thinking than typing nowadays. Honestly, putting more thought into what you are going to do next, instead of just banging away until it works, usually alleviates the issues you mention. Experience helps with this too (not saying you are inexperienced, just that the more experienced I become, the easier it is to tackle any problem).

Also these refactor the world type changes aren't very common in my experience. Well thought out code can avoid a lot of it.

I find Haskell incredibly valuable for writing up quick solutions. Notably because I have built up a large library of useful pure functions that I have at my disposal, and they cover a large part of the domains I am working in (ever expanding of course). It really just becomes a matter of writing a bit of glue for the right pure functions.


This is a good point: people don't care about correctness sometimes! Fortunately, you can get part of the way there by stubbing out complaining segments of the code with 'undefined', which only bug out if they are run.


I guess this is true if you sit in front of the computer with no clue on how to tackle the problem at hand and have no intention of producing an actual robust program as the immediate output. If you are simply messing around with no direction, then I would agree.

Let's give some examples of hard problems first: designing a version control system, writing a complex revenue simulation model, designing an algebra for arbitrary data model customization, etc.

When dealing with such "hard" problems, I personally usually do a fair amount of thinking before typing a line of code. Once I have a hypothesis as to how the solution might look with at least a starting point for the implementation, I start fleshing it out in Haskell. This prototyping actually goes quite fast with data types and transformations leading the way and making what needs to be done fairly clear. In a short amount of time, you end up with a pretty robust (thanks to various reasons mentioned by the author and several others in this thread) initial prototype. I have a feeling this experience is fairly common in the Haskell community.

If your idea was completely rubbish, then yes, you may have to scrap it altogether. Otherwise, the type system becomes your guide in making incremental adjustments to your data types and transformations. I would argue that any major change would require the same, if not more, time/work in Python/Ruby/Perl if you take the correctness of your code seriously.

Also, keep in mind that Haskell has really advanced type inference, which means you don't have to write any types at all if you don't want to. You are required to write them only if you are using higher-rank types, but then you really ought to know what you are doing.


Let's give some examples of hard problems first: designing a version control system, writing a complex revenue simulation model, designing an algebra for arbitrary data model customization, etc.

If this is true then why is a VCS written in C (git) not left in the dust by Darcs?


I'm pretty sure there are several VCSs written in C that are left in the dust by Darcs, and keep in mind that git adoption got a head start from being promoted by some guy named Linus Torvalds. (Perhaps you've heard of him?)


I don't see your point - there are many, many things beyond the choice of language that determine the eventual success of a product.

You might want to read: http://www.paulgraham.com/disagree.html


I think a lot of this story can be explained by looking at the size of the contributor lists:

http://git-scm.com/about
http://wiki.darcs.net/DarcsTeam

Git is very polished because a lot of people have put a lot of time into it.


often when i'm fleshing out ideas or trying a new approach, i don't need everything to be consistent---just the bit i'm working on, even then i don't care too much.

What I do is I simply compile/run unit tests for a piece of the program rather than the whole thing. Then, if I decide to keep the changes, compiler errors tell me every single thing I need to fix.

In python I need to hope everything that will fail fails in unit tests.


Kind of a counterpoint to Zed Shaw. Fuck unit tests and their false sense of security.


I think that's taking things a bit far.

The compiler will make sure you are returning an Int, but it has no ability to make sure it is the right Int.

Or, in more practical terms, the type system has the ability to make sure I'm returning a Reader MyConfig Int. This guarantees that a) all configuration parameters come from a single known location, b) there are no side effects, and c) the output is an Int. But I still need unit tests to make sure the result is correct.
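
A minimal sketch of that division of labour, assuming a hypothetical MyConfig with one field: the type rules out stray effects and configuration sources, while a unit test pins down the value.

    import Control.Monad.Reader

    data MyConfig = MyConfig { threshold :: Int }

    score :: Reader MyConfig Int
    score = do
        t <- asks threshold
        return (t * 2)   -- the type can't tell us whether (*2) is the right formula

    -- a unit test still has to check the actual number
    testScore :: Bool
    testScore = runReader score (MyConfig 21) == 42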


Ints and strings are a poor example because they have a very large domain. These features are much more useful when you're talking about more complex data structures, like the intermediate representation in a compiler. I think typechecking is most useful with complex data structures, anyway - their invariants can be built into the type.

There are other languages with the ability to constrain types to specific ranges, etc. (without having to define e.g. RInt of One | Two | Three | Four | ...), but they're still pretty obscure.
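
The usual Haskell idiom for building such an invariant into a type is a smart constructor; a minimal sketch with a hypothetical Percent type:

    newtype Percent = Percent Int         -- keep the Percent constructor unexported

    mkPercent :: Int -> Maybe Percent
    mkPercent n
        | n >= 0 && n <= 100 = Just (Percent n)
        | otherwise          = Nothing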


Haskell's type system only guarantees things that the type system can express. One thing it does not guarantee is "does my program do what it's supposed to do". For that, you need unit tests and a good design.


There's a terrible horrible no good really bad solution to this problem that I was surprised to discover one day when pointlessly experimenting with monad transformers:

Don't write type declarations.

Well, there's another part to it, too: Always open up data constructors using the record syntax. Record punning makes that bearable to use and usually briefer than typing out every argument of a big constructor. Combine that with using record updates rather than constructors wherever possible and appropriate, and you can minimise the unrelated knock-on effects of data constructor changes pretty well.
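
A small sketch of what that looks like (the Config type is hypothetical; punning needs the NamedFieldPuns extension):

    {-# LANGUAGE NamedFieldPuns #-}

    data Config = Config { host :: String, port :: Int, verbose :: Bool }

    -- record pun: only the fields used here are named, so adding a field
    -- to Config later doesn't touch this function
    describe :: Config -> String
    describe Config{host, port} = host ++ ":" ++ show port

    -- record update: mentions only the field that changes
    quieten :: Config -> Config
    quieten c = c { verbose = False }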

But! The point I was digressing from was, avoiding type declarations can save you so much trouble when you're making changes that affect 90% of the types and 5% of the code. That includes adding or removing parameters from type constructors or type classes, renaming types, and turning single types into typeclasses and vice versa. If you never said Array in a type declaration, and you suddenly realise you need more generic array types, you just import Data.Array.IArray rather than Data.Array and you're done. If you never said IntMap in a type declaration, and you suddenly realise you need more generic int-keyed finite maps, you can just import Data.EnumMap rather than Data.IntMap and you're done. If you never said your Foo type constructor takes two arguments in a type declaration, and you suddenly realise that it should, you change the type declaration and you're done.

Sure, when you're writing a function, you start with a type declaration, and let the arrows guide your path until you're done. And when you know you've got the meaning of something precisely right, you go ahead and fix it with a type so the compiler will know to look for errors elsewhere. But when you're not sure everything you're saying is the right thing, try to say as little as possible. That way you won't need to eat your words later.


>when i'm fleshing out ideas or trying a new approach, i don't need everything to be consistent---just the bit i'm working on

I would advise you to use ghci - a REPL for Haskell.

That way you work on some little subproblem, and when you succeed you can spread your changes over the complete program.

I prefer to do just that, relying on the compiler in the small and in the large.
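
A hypothetical session of that workflow, loading a single module and poking at just the subproblem:

    $ ghci Parser.hs
    ghci> parseHeader "GET /"   -- poke at the subproblem in isolation
    ghci> :reload               -- retypecheck after edits, then poke again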


I began learning haskell not so long ago, so i would like to ask you: how long have you been using haskell? (also interesting for python, although it doesn't say as much IMHO)

Do you think it's really due to the language itself, or might it be a question of familiarity?

I'm really interested. I would intuitively have the same idea about it as the one you just described, but i'm really far too new to haskell to be able to judge.


i'd heavily used haskell for 5 years. i was taught by some of the people who developed the original haskell specification, and the people who worked on haskell prime. i aced their haskell course (~95%. first in the year, iirc).

i've used haskell in work i was paid for, i've used haskell for real-time low-level video processing, as well as high level machine learning. i've contributed to a few open source haskell projects. the entire code for one of my research projects was written in haskell, coming to 2.5k lines of code.

you'll often find people who advocate haskell have the attitude that you "just don't get it" or "don't have enough experience" (an attitude you might find pervades some of the responses to my comment... i certainly do). i do like haskell, it's beautiful in many ways, but for large, hard problems where it's tricky to know how to develop the code, i found it inappropriate. where you know what you want to write (i.e., you have a solid specification or are willing to spend a while properly engineering things), haskell is great.


Would you then recommend prototyping in a dynamically typed language, then re-implementing that in Haskell once you know more about the solution space? Exploration first, hardening second?


I have used Haskell for research problems in the past and I have found it to result in a net improvement in development speed for me, even over Python. Part of this is that the way I think about problems maps well onto Haskell's feature set, but partly this is because I honestly do feel that Haskell lets me experiment with changes with a lot of agility.

Here's my general approach for Haskell development. I try to separate everything I do into a number of fairly small modules, each structured around a core idea. I try to keep the coupling between the modules where the work happens to a minimum; instead I have certain tiny modules that exist only as glue, providing whatever coupling is unavoidable and inherent. All of this means that, when I'm experimenting with a big change, I usually only need to change one or two modules at first. (The whole project won't compile, of course, but those modules will.) I can then verify that things are to my liking by loading just those modules in the REPL and playing around with them; sometimes I'll write mock versions of some of the other modules directly in the REPL, just enough to experiment with. The static typing doesn't get in my way when I work this way, and it's a positive boon when I decide I like the changes and want to update the whole project to work with them; then it's very easy to see what's impacted and make the necessary changes.

The REPL, modules, and the type system work well for me to let me experiment with agility AND confidence. I fully recognize that some problem domains might make the sort of separations I do more difficult, but for my work, at least, Haskell has been a really good fit.


I find that Haskell shines at a specific kind of hard problem, namely those involving significant math. It's particularly good for problems involving abstract algebra. Why?

1) The Haskell community tends to rely heavily on math—it's part of the culture. Part of this comes from Haskell's functional nature, which favors mathematical approaches, and part of it comes from a community tradition of using more math to work around the limits of functional programming.

2) A typical Haskell library is usually an algebra defined over an abstract data type. Haskell has excellent algebraic data types and good pattern matching, which makes this a natural way to work.

3) The QuickCheck library generates random input data for functional programs and checks that certain invariants hold across all of it. For example, you can use QuickCheck to verify that (reverse (reverse xs)) is equal to xs. (There's a runnable sketch of this after the list.)

4) If your problem domain has a certain mathematical structure (specifically, a topos or a Cartesian closed category), then there are some really slick ways to map it onto a Haskell DSL. See my notes at http://www.randomhacks.net/darcs/probability-monads/probabil... (PDF) for an example involving Bayesian filtering and particle systems.

5) Haskell's strong type system makes it feasible to work at very high levels of abstraction without going completely insane. For ordinary CRUD apps, I actually prefer dynamically-typed languages, but when it comes to scary math, a good type system makes life a lot easier.

6) Because Haskell is so steeped in math, it tends to force my brain into "math mode" well before I reach the mathematical heart of a problem.
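
The reverse example from point 3, as a runnable sketch (the Int specialisation just keeps the property monomorphic):

    import Test.QuickCheck

    prop_reverse :: [Int] -> Bool
    prop_reverse xs = reverse (reverse xs) == xs

    main :: IO ()
    main = quickCheck prop_reverse   -- prints "+++ OK, passed 100 tests."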

So I'm not convinced that Haskell is the ideal language for small DSLs. (Ruby shines, here, for almost exactly opposite reasons.) But when there's enough math involved, Haskell is absolutely a wonderful language.


Have you tried F#? If so, could you say the same things about F# as well?

Also could you explain the last point "Because Haskell is so steeped in math"?


The linked notes on particle filtering are very interesting, thanks for sharing. You've just piqued my interest in Haskell.


"Reason 1: Haskell shines at domain specific languages.

"[...] if you can embed a domain specific language into the programming language you’re using, things that looked very difficult can start to appear doable. Haskell, of course, excels at this. [...]"

Says who? Just because I can write `f(x)` as `f x`, with full-featured operator precedence and associativity tables, doesn't mean I can introduce new syntax into the language. I can't even find a moral equivalent of ELisp's rx package for Haskell.

"[...] Lisp is defined by this idea, and it dominates the language design, but sometimes at the cost of having a clean combinatorial approach to putting different notation together."

This is either weird nonsense or the old variable capture argument. You can write crappy functions in any language too, that doesn't mean we should take away the programmer's ability to write new ones.

"Reason 2: Haskell shines at naming ideas. If you watch people tackle difficult tasks in many other areas of life, you’ll notice this common thread. You can’t talk about something until you have a name for it."

I applaud the willingness to appeal to natural language, but this isn't even close to true.

This blog post seems like an argument for Haskell's value based on specific times the author has enjoyed programming in Haskell. I call post hoc ergo propter hoc shenanigans.


"Just because I can write `f(x)` as `f x` with full featured operator precedence and associativity tables doesn't mean I can introduce new syntax into the language."

There are two definitions of "DSL", and I find mixing them to be a bad idea. I prefer to confine "DSL" to a fully-fledged actual language, with its own parser and evaluator. Haskell is just about the only language I know that makes this easy enough to actually consider as the solution to a problem.

Then there's "glorified API to look like a language", a definition I don't like but is clearly in current use, and the one used in this post. Haskell is pretty decent at that, but of course you can't escape from the fact you're using Haskell. But then, people pretty freely use the term "DSL" in Ruby where you can't escape the fact that you're in Ruby. In either case, "adding new syntax" is not actually part of the general definitions of DSL nowadays. The "adding of new syntax" is an illusion, and I guess that's part of why I don't like this definition of DSL, there's actually an element of deception in it and in practice I have found people not 100% comfortable with the base language the DSL is in often end up deceived and less effective. And either way, if you really, truly want some new syntax, Template Haskell can mostly keep up with Lisp anyhow, but not entirely. There may not be, at this moment, a precise match for the rx package already on Hackage but I believe Haskell has all the necessary functionality to create one, up to and including compile-time regular expression compilation. Though given the way Haskell works, it is also the case that compile time RE compilation is way less useful, because it's easy to make it compile only once even at run time with the way Haskell works. As is so frequently the case, what takes a macro in Lisp does not require a macro in Haskell.

"This is either weird nonsense or the old variable capture argument."

No, it's a functional purity argument. Lisp in practice has the same problems with mutation and state as pretty much any other language. Insert debate about importance of functional purity here. (That is, I'm not making the argument now, I'm just letting you know that's what was being referenced.)


>No, it's a functional purity argument.

I'm pretty sure it's a reference to the fact that combining macros can be risky; one of the risks is variable capture.


That's a subset of the functional purity problem. It's a relatively-commonly-understood one, but Haskell has taken the argument far further than most, such that that is now only a subset of a larger argument about state and mutation. The argument is not merely that you've got some state being mutated in a bad way, the argument is that the fact you've got state mutating at all is a bad thing. In this case, it's state in the compiler phase, but it's ultimately the same argument. The argument would be that combining any two things that mutate state is inferior to trying to combine two things that don't, at any level.

I'm personally still skeptical of this general argument, but I mean that in the "true" sense of skeptical. I think in a lot of ways the benefits have panned out as promised but there's some costs still swept under the rug. Rather a lot of the development ongoing even now in the core Haskell community can be viewed as various ways of lowering or eliminating the costs associated with their approach.

(For instance, the still-ongoing, but rapidly converging, changes associated with trying to create a decent string library that does not treat strings as linked lists of integers. On the one hand you can see it as normal feature work, but on the other it's a way of trying to go from having an unusually bad string story induced by functional purity to potentially having an exceptionally good one. It's this sort of work that actually keeps me interested in Haskell; I don't know of any other community doing so many actually new things to prove out their philosophy, instead of rehashing mutation-based OO in yet another way.)


Imagine Haskell lacked an "if then else" expression. Then:

   my_if :: Bool -> a -> a -> a
   my_if True  ifTrue _       = ifTrue
   my_if False _      ifFalse = ifFalse
Usage (those 2 lines are equivalent):

   my_if (x > 42) (foo x) (bar x)
   if (x > 42) then foo x else bar x
That's the basis of what we call combinator libraries. Application syntax and infix operators, used wisely, actually feel like ad-hoc syntax. Combined with lazy evaluation (my_if only evaluates the branch it needs), they make fully fledged macros much, much less useful.


Right, but I can add a sub-language with lazy evaluation, infix operators, and application syntax to Lisp...

I fully recognize Haskell can do a very useful subset of the features of real dynamic syntax. It's still a proper subset though.


What do you mean by "dynamic"? Syntax happens at compile time, right? Different syntax can apply to different part of the source code, but that's still static. Did I miss something?


He probably means that the syntax can vary at compile time. An example:

Actual code I wrote (for https://github.com/stucchio/Mp3FS ):

    type Mp3fsM a = ReaderT Mp3fsInternalData IO a

    runMp3fsM  f r = runReaderT f r
    runMp3fsM1 f r = \x -> runReaderT (f x) r
    runMp3fsM2 f r = \x -> \y -> runReaderT (f x y) r
    runMp3fsM3 f r = \x -> \y -> \z -> runReaderT (f x y z) r
    runMp3fsM4 f r = \x -> \y -> \z -> \t -> runReaderT (f x y z t) r
In lisp, this sort of thing could be handled by a macro.


    In lisp, this sort of thing could be handled by a macro.
So why not apples-to-apples and use Template Haskell to do it?


Because it doesn't fit my original argument, that Haskell's combinatorial approach can do much (though not all) of what Lisp macros do.


I get it. `my_if` doesn't require separate recompilation of the functions that use it. Well played.

I still see a subset of the functionality of a language with macros over sexprs as the syntax. It will take me some time, however, to come up with a good terse reply to that issue that fits in these tiny text boxes. The gist would probably be: if one is going to really change the syntax of a language, the programs that use the changed bits have to be re-parsed at some point. That's somehow tautological.


I actually agree. If you want for instance some weird scoping rule like

   ; example taken from PG's On Lisp
   ; "it" refers to the big-long-expression
   ; "aif" stands for "anamorphic if"
   (aif big-long-expression
     (foo it)
     (bar it))
then combinator libraries won't cut it. Template Haskell might, however (but it's not standard, and beside our point).

Now my point is that the subset we speak of is easier to achieve through functions with lazy evaluation. Macros are more capable, but harder to deal with. This is magnified by Haskell's paranoid type system: functions are checked in ways macros aren't.

So my guess is that neither system dominates the other.


haskell's niche: PL theory research. as a professor of mine said, "it's an incubator for PL theory". that said, are there merits to non-PL researchers learning it? i think so. for one, the functional paradigm is increasingly being integrated into the mainstream languages and the "genpop" is starting to use FP concepts to solve problems. haskell is supposedly "the purest" functional language. therefore, if you like FP, studying haskell is like tapping into "the source".

while i find functional constructs aesthetically more pleasing than imperative ones, a reason i often hear given in support of using functional languages is the promise they offer w.r.t concurrency/parallelism. in the words of the same professor i quoted earlier, "concurrency/parallelism is currently broken in most other languages. it's like building skyscrapers atop matchsticks". i believe this now. i also believe haskell's approach to be quite a big step in the right direction (STM and atomically). an outstanding paper on the topic exists here http://research.microsoft.com/en-us/um/people/simonpj/papers...

regarding haskell's type system...in my opinion, hands down the best currently in existence. yes, it's hard to work with at times. i strongly disliked it at first. yes, the type errors can be a bit cryptic at times, at least until one gets used to them. the upside? there be real magics there. and i'm not talking about devices that begin with the letter i. the majority of the time, if i can get code to type check it works flawlessly.

working with it over the last 6 months has really changed my thinking about programming languages and has actually swayed me to go further in that direction in my studies.

don't mean to sound like a fool in love, but ...



