Haskell I/O is pure

ambrop7 · on Feb 9, 2013

C is also pure. After all, C code is just data until it gets compiled and executed. When and how often this happens is irrelevant. You could write a C interpreter in Haskell, and then your C code would be as pure as anything else in Haskell!

loup-vaillant · on Feb 9, 2013

That doesn't do monadic I/O justice. An IO value in Haskell is a first class value. Your typical C instruction is not.

kenko · on Feb 9, 2013

http://conal.net/blog/posts/the-c-language-is-purely-functio...

Chattered · on Feb 9, 2013

Unless you are Conal himself, I demand an acknowledgement!

http://conal.net/blog/posts/the-c-language-is-purely-functio...

1 minute behind kenko! :P

jeffdavis · on Feb 9, 2013

"When and how often this happens is irrelevant."

Why do you say that?

ambrop7 · on Feb 9, 2013

Because it's an implementation detail; it has no relevance from a theoretical view of the language. It's like the "compiled/interpreted" label.

jeffdavis · on Feb 10, 2013

I'm not an expert, but my understanding is that IO is not necessarily run even if an expression is evaluated that produces a value of type "IO a". That is a theoretical difference, not just an implementation detail.

ambrop7 · on Feb 9, 2013

I just made that argument up, didn't know about Conal's post, thanks for the link!

dons · on Feb 9, 2013

Looks like a reimplementation of Wouter Swierstra's work on a functional model of IO:

http://www.staff.science.uu.nl/~swier004/Publications/Beauty...

Wouter Swierstra and Thorsten Altenkirch, "Beauty in the Beast: A Functional Semantics of the Awkward Squad", in Haskell '07: Proceedings of the ACM SIGPLAN Workshop on Haskell, pp. 25-36, Freiburg, Germany, 2007.

crntaylor · on Feb 9, 2013

When someone says I've reimplemented some of Wouter Swierstra's work, I can't help but feel that's a massive compliment. Thanks!

ibotty · on Feb 9, 2013

hi dons,

that is a nice link that i will check out. (i just read their data types a la carte because of a recommendation by you on stack overflow afair just last week.)

i was wondering whether it is possible to restructure the io monad into smaller monads that only do more restricted things. is it possible to do something like that w/o breaking backwards compatibility? and if so, is it possible to really do so in the haskell standards process?

i'd really like to reason about programs by looking at a type signature like

  cat :: FilePath -> Term (Teletype :+: FileSystem) ()

instead of

  cat :: FilePath -> IO ()

evincarofautumn · on Feb 9, 2013

This kind of thing is usually done with typeclasses. You have the “real” monad, but you only use it via the restricted interface allowed by the typeclass constraints. Off the top of my head:

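    import Control.Monad.IO.Class (MonadIO, liftIO)

    -- Terminal access only, with defaults that delegate to the real IO actions.
    class (MonadIO m) => MonadTT m where
      ttGetLine :: m String
      ttGetLine = liftIO getLine

      ttPutStrLn :: String -> m ()
      ttPutStrLn = liftIO . putStrLn

    -- File system access only.
    class (MonadIO m) => MonadFS m where
      fsReadFile :: FilePath -> m String
      fsReadFile = liftIO . readFile

    cat :: (MonadTT m, MonadFS m) => FilePath -> m ()
    cat path = fsReadFile path >>= ttPutStrLn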

In reality I don’t think I would structure it quite this way, but you get the general idea. Code to interfaces, not to implementations.

evincarofautumn · on Feb 10, 2013

Okay, now that I have a moment, here’s how to do it with proper encapsulation. The previous example let you perform arbitrary IO with “liftIO”, which is exactly what you don’t want.

    class MonadTT m where
      ttGetLine :: m String
      ttPutStrLn :: String -> m ()

    instance MonadTT IO where
      ttGetLine = getLine
      ttPutStrLn = putStrLn

    class MonadFS m where
      fsReadFile :: FilePath -> m String

    instance MonadFS IO where
      fsReadFile = readFile

    cat :: (MonadTT m, MonadFS m) => FilePath -> m ()
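
A minimal sketch of what this encapsulation could buy, using only `base`. The classes and `IO` instances repeat the ones above; the `Mock` type, its instances, and the body of `cat` are assumptions added for illustration, not part of the original comment. Note that `cat` needs an extra `Monad m` constraint to sequence the two calls.

```haskell
import Data.Maybe (fromMaybe)

class MonadTT m where
  ttGetLine  :: m String
  ttPutStrLn :: String -> m ()

class MonadFS m where
  fsReadFile :: FilePath -> m String

instance MonadTT IO where
  ttGetLine  = getLine
  ttPutStrLn = putStrLn

instance MonadFS IO where
  fsReadFile = readFile

-- Monad is required so the read can be sequenced with the write.
cat :: (Monad m, MonadTT m, MonadFS m) => FilePath -> m ()
cat path = fsReadFile path >>= ttPutStrLn

-- Hypothetical pure interpreter: takes a fake file system and
-- returns the result together with the lines that were "printed".
newtype Mock a = Mock { runMock :: [(FilePath, String)] -> (a, [String]) }

instance Functor Mock where
  fmap f (Mock g) = Mock $ \fs -> let (a, out) = g fs in (f a, out)

instance Applicative Mock where
  pure a = Mock $ \_ -> (a, [])
  Mock f <*> Mock g = Mock $ \fs ->
    let (h, o1) = f fs
        (a, o2) = g fs
    in (h a, o1 ++ o2)

instance Monad Mock where
  Mock g >>= k = Mock $ \fs ->
    let (a, o1) = g fs
        (b, o2) = runMock (k a) fs
    in (b, o1 ++ o2)

instance MonadTT Mock where
  ttGetLine    = Mock $ \_ -> ("", [])   -- this mock has no stdin
  ttPutStrLn s = Mock $ \_ -> ((), [s])

instance MonadFS Mock where
  -- Missing files read as empty in this mock.
  fsReadFile p = Mock $ \fs -> (fromMaybe "" (lookup p fs), [])

main :: IO ()
main = print (snd (runMock (cat "a.txt") [("a.txt", "hello")]))
```

Run as a program, this prints `["hello"]` without touching the real file system. The same `cat` also type-checks at `IO ()`, where it reads and prints a real file.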

ibotty · on Feb 11, 2013

you are of course right that that would work.

i was wondering more whether it is possible to structure the io monad (using algebras as in data types a la carte) or as you did here while _preserving backwards compatibility_.

right now io is very broad and can be pretty much everything. it should be more fine-grained. (i still have to look into safe haskell and its rio, but my superficial look says it's something different.)

Evbn · on Feb 9, 2013

I seem to recall Ken Shan writing about this. But I can't remember much now. I feel like it is related to "regions" modeling.

Zr40 · on Feb 9, 2013

Permalink: http://chris-taylor.github.com/blog/2013/02/09/io-is-not-a-s...

crntaylor · on Feb 9, 2013

Thanks! Stupid of me to miss that.

evincarofautumn · on Feb 9, 2013

So basically, “main” is not an impure function. It’s a pure value representing an impure function. A monad is essentially a data type that represents computations in some domain-specific language, using data (return) and code (>>=) from the host language. The only thing special about the IO datatype is that the impure Haskell runtime can execute it.
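
To make the "data type representing a DSL" idea concrete, here is a small sketch. The type `Tele`, the helpers `putLine` and `getLine'`, and both interpreters are hypothetical names chosen for this example; the real `IO` type is abstract and is not implemented this way in GHC.

```haskell
-- A tiny terminal DSL, represented as plain data.
data Tele a
  = Done a                     -- finished, with a result
  | PutLine String (Tele a)    -- write a line, then continue
  | GetLine (String -> Tele a) -- read a line, then continue with it

instance Functor Tele where
  fmap f m = m >>= (Done . f)

instance Applicative Tele where
  pure = Done
  mf <*> ma = mf >>= \f -> fmap f ma

instance Monad Tele where
  Done a      >>= k = k a
  PutLine s n >>= k = PutLine s (n >>= k)
  GetLine f   >>= k = GetLine (\s -> f s >>= k)

putLine :: String -> Tele ()
putLine s = PutLine s (Done ())

getLine' :: Tele String
getLine' = GetLine Done

-- A program in the DSL: just a value, nothing happens when it is built.
greet :: Tele ()
greet = do
  putLine "Name?"
  name <- getLine'
  putLine ("Hello, " ++ name)

-- Impure interpreter: plays the role of the runtime.
runIO :: Tele a -> IO a
runIO (Done a)      = return a
runIO (PutLine s n) = putStrLn s >> runIO n
runIO (GetLine f)   = getLine >>= runIO . f

-- Pure interpreter: feeds canned input lines and collects the output.
-- It stops early if the program asks for input that is not there.
runPure :: [String] -> Tele a -> [String]
runPure _        (Done _)      = []
runPure ins      (PutLine s n) = s : runPure ins n
runPure (i:ins)  (GetLine f)   = runPure ins (f i)
runPure []       (GetLine _)   = []

main :: IO ()
main = print (runPure ["Ada"] greet)
```

Here `runPure ["Ada"] greet` evaluates to `["Name?","Hello, Ada"]`, while `runIO greet` would perform the same conversation on the real terminal. The program `greet` is identical in both cases; only the interpreter differs.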

A1kmm · on Feb 9, 2013

If you want to do this in practice rather than as a specialised example, I would suggest using something like Conduit, rather than reinventing the wheel.

Conduit is used widely in the Yesod web framework (for example, to implement the HTTP server), and could easily be hooked up to stdin and stdout for an interactive pure program.

Conduit internally looks like this: http://hackage.haskell.org/packages/archive/conduit/0.5.6/do... - note that you get to choose an underlying monad m and run things in it using PipeM, but you could just make this the Identity monad.

jeremyjh · on Feb 9, 2013

I had an "aha" moment here - very nice little article.

crntaylor · on Feb 9, 2013

Thanks. I wrote this post with beginners in mind, so it's good to know that it's provoked an "aha" moment in at least one person.

meric · on Feb 10, 2013

Two! Is this the "correct" way to make domain specific languages? If there is more to it, is there another text you recommend?

ibotty · on Feb 9, 2013

see https://news.ycombinator.com/item?id=5193368 for another submission.

dschiptsov · on Feb 9, 2013

What is the point of having this "mini-language"?

What would happen to purity if I hit ^D?)

IsTom · on Feb 9, 2013

It's useful with Safe Haskell, which limits a few things so you can 'safely' run untrusted code. With this interface you could limit IO actions the user is allowed to run.

millstone · on Feb 9, 2013

If Haskell I/O is pure, why can't it be intermingled with other pure functions?

tmhedberg · on Feb 9, 2013

It can be. A value of type `IO foo` is an ordinary, pure value just like a value of any other data type. It can be passed as an argument to a pure function, returned from a pure function, manipulated using any functions that can apply to `IO` values, stored inside an arbitrary data structure, composed with other `IO` values to make new ones, etc.
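
As a small illustration of that paragraph (the names `actions` and `pick` are made up for this sketch), `IO` values can sit in a list and be rearranged by an ordinary pure function. Nothing is printed until `main` hands the result to the runtime.

```haskell
-- IO values stored in an ordinary list.
actions :: [IO ()]
actions = [putStrLn "one", putStrLn "two", putStrLn "three"]

-- A pure function over IO values: it selects and reorders actions
-- without running any of them.
pick :: [IO a] -> [IO a]
pick = reverse . take 2

main :: IO ()
main = sequence_ (pick actions)
```

When run, this prints `two` and then `one`; the third action is built but never executed.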

What isn't pure is the process of extracting the `foo` from the `IO foo`, i.e. getting the result of the I/O action. For instance,

    getLine :: IO String

is not a function in Haskell, which you can "call" in order to get a string from stdin. It is instead a value which represents an I/O operation yet to be performed. There is no way to force it to yield its result within your entirely pure program (ignoring escape hatches like `unsafePerformIO` and friends), but you can nonetheless compose it with other I/O operations which depend on its result, such as putStrLn:

    echo :: IO ()
    echo = getLine >>= putStrLn

The above defines a pure value representing an I/O operation which, when performed, will echo a line from stdin to stdout. The `>>=` combines the two pure values to yield a new pure value. Again, there is no way to directly cause this I/O action to be executed, which would necessarily introduce impurity. You can only ensure that it will happen later by threading it into the special I/O operation `Main.main :: IO ()`, which every program must define, and which will be implicitly performed when your program is run.

By reifying "statements with side effects" into simple, pure, first-class values of type `IO`, Haskell lets us remain 100% pure even while our program will eventually have effects on the real world outside of the machine. Haskell code is just a set of pure declarations; the side effects are deferred until after our code has already run (which conceptually happens instantaneously when the program is launched). Since our code runs "before" any side effects occur, it cannot possibly depend on their results.

A Haskell program merely sets up the dominoes in an elaborate arrangement, then looks the other way while the runtime system knocks over the first one.

millstone · on Feb 10, 2013

I appreciate your detailed answer, but I don't understand the distinction - it feels almost like a form of solipsism to me.

Say I'm a traveler exploring the programming language universe, and I land on a programming language planet. I wish to determine whether said planet has functions that "you can call in order to get a string from stdin", or whether it "sets up the dominoes in an elaborate arrangement" and "lets the runtime system knock them down". What experiment could I perform? What program could I write that would illustrate the difference?

Evbn · on Feb 9, 2013

That's not entirely fair. Better to say that Haskell separates the pure from impure, but it has both. The straight line dependencies of the IO sequencing are still part of the Haskell program.

The domino analogy could apply to C just as well, except that there are only a few lines of code before the first domino falls.

tmhedberg · on Feb 9, 2013

Sorry, but I disagree. A Haskell program emphatically does not have any concept of impurity or statefulness whatsoever, except in the form of `unsafePerformIO`, which circumvents the type system, only comes into play under "extenuating circumstances", and which an ordinary program has no use or need for.

The language itself does not have any notion of sequencing or order of evaluation, nor does it need one. It inhabits a pure, mathematical universe where such things have no meaning. Given an arithmetic expression such as

    (1 + 2) * (3 - 4)

it naturally makes no difference which parenthesized expression is evaluated first, because the evaluation of mathematical expressions cannot have side effects. It would be absurd to consider a universe in which such a choice could influence the result of fully evaluating the expression, because such a universe would be wholly unlike our own, to the extent that our intuitive rules of logic and causality would no longer apply.

Likewise, given the Haskell expression

    putStrLn "Hello" >> putStrLn "world!"

it makes no difference which of the two IO expressions is evaluated first, because the evaluation of those values cannot have side effects. The second one may be evaluated before the first; it makes no difference. The execution of the resulting action will have side effects and impose sequencing, but there is no way to express execution or cause it to occur in the Haskell language. Execution can only happen "outside" of the context of the program itself, not in the program's abstract and pure universe, but in our own imperative and stateful one, as a side effect of the RTS.
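
The distinction between evaluating and executing can be observed with `seq`, which forces evaluation of a value to weak head normal form. This is only a sketch of the point above; the names `hello` and `world` are illustrative.

```haskell
main :: IO ()
main = do
  let hello = putStrLn "Hello"
      world = putStrLn "world!"
  -- Evaluate the second action before the first. This builds the
  -- action values but prints nothing, in either order.
  world `seq` hello `seq` return ()
  -- Execution order is fixed by (>>), not by evaluation order.
  hello >> world
```

When run, the output is `Hello` followed by `world!`, even though `world` was evaluated first.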

A C program, to extend the metaphor, knocks over each domino one by one as soon as it is placed. In fact, placing the domino and knocking it over are one and the same thing. The programmer has no opportunity to leave the room before the side effects come into play, because imperative languages do not distinguish between evaluation and execution. A purely functional language not only draws that distinction, but eliminates execution from the picture entirely, leaving it to some external entity to actually do the deed that the program constructs from pure components.

millstone · on Feb 10, 2013

From what you wrote, it sounds like what Haskell buys you is a layer of indirection - the proper analogy is not between a Haskell program and a C program, but instead between a Haskell program and a C compiler. A C compiler does not care in what order it outputs code, so long as it all gets output by the time it finishes.

If I understand this, Haskell "compiles" pure code into a sequence of side effects, and then the RTS executes them. So say I wrote a C compiler that, given source code, outputs a subcompiler that compiles the source code and then executes it. My C code does the same thing as it would under a traditional compiler, except more slowly - why should I want this?

jerf · on Feb 9, 2013

It can be. But what gets intermingled with the other functions is indeed the pure language describing what IO actions to take, not their result. For instance, I've had some good experiences returning IO actions out of STM transactions.
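
A hedged sketch of that pattern, assuming the `stm` package that ships with GHC. The `withdraw` function and the account scenario are invented for illustration. The transaction only decides which `IO` action should follow; the action is run after the transaction commits, so a retried transaction cannot print twice.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
import Control.Monad (join)

-- Hypothetical example: try to withdraw, and return the IO action
-- that reports the outcome. No I/O happens inside the transaction.
withdraw :: TVar Int -> Int -> STM (IO ())
withdraw balance amount = do
  current <- readTVar balance
  if current >= amount
    then do
      writeTVar balance (current - amount)
      return (putStrLn ("withdrew " ++ show amount))
    else return (putStrLn "insufficient funds")

main :: IO ()
main = do
  balance <- newTVarIO 100
  -- atomically yields an IO (IO ()); join runs the returned action.
  join (atomically (withdraw balance 30))
  join (atomically (withdraw balance 80))
  readTVarIO balance >>= print
```

When run, this prints `withdrew 30`, then `insufficient funds`, then the remaining balance `70`.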