C is also pure. After all, C code is just data until it gets compiled and executed. When and how often this happens is irrelevant. You could write a C interpreter in Haskell, and then your C code would be as pure as anything else in Haskell!
I'm not an expert, but my understanding is that IO is not necessarily run even if an expression is evaluated that produces a value of type "IO a". That is a theoretical difference, not just an implementation detail.
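To make that concrete, here is a small sketch using only `base`. The names `bump` and `demo` are made up for illustration, and a counter in an `IORef` stands in for a visible side effect. Evaluating the action with `seq` or `$!`, or storing it in a list, leaves the counter alone; only sequencing the action inside `demo` runs it.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef, readIORef)

-- Hypothetical helper: an action that bumps a counter when it is *run*.
bump :: IORef Int -> IO ()
bump ref = modifyIORef ref (+ 1)

-- Evaluates the action several times, but runs it exactly once.
demo :: IO Int
demo = do
  ref <- newIORef 0
  let action = bump ref
  action `seq` pure ()            -- evaluated to WHNF, not run
  _ <- pure $! action             -- evaluated again, still not run
  let _unused = [action, action]  -- stored in a list, never run
  action                          -- run exactly once
  readIORef ref
```

Loading this in GHCi and running `demo` should give back 1, not 3 or 5.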
Wouter Swierstra and Thorsten Altenkirch. "Beauty in the Beast: A Functional Semantics of the Awkward Squad." In Haskell '07: Proceedings of the ACM SIGPLAN Workshop on Haskell, pages 25-36. Freiburg, Germany, 2007.
that is a nice link that i will check out. (i just read their data types a la carte because of a recommendation by you on stack overflow afair just last week.)
i was wondering whether it is possible to restructure the io monad into smaller monads that only do more restricted things. is it possible to do something like that w/o breaking backwards compatibility? and if so, is it possible to really do so in the haskell standards process?
i'd really like to reason about programs by looking at a type signature like
cat :: FilePath -> Term (Teletype :+: FileSystem) ()
This kind of thing is usually done with typeclasses. You have the “real” monad, but you only use it via the restricted interface allowed by the typeclass constraints. Off the top of my head:
import Control.Monad.IO.Class (MonadIO, liftIO)
class (MonadIO m) => MonadTT m where
    ttGetLine :: m String
    ttGetLine = liftIO getLine
    ttPutStrLn :: String -> m ()
    ttPutStrLn = liftIO . putStrLn
class (MonadIO m) => MonadFS m where
    fsReadFile :: FilePath -> m String
    fsReadFile = liftIO . readFile
cat :: (MonadTT m, MonadFS m) => FilePath -> m ()
In reality I don’t think I would structure it quite this way, but you get the general idea. Code to interfaces, not to implementations.
Okay, now that I have a moment, here’s how to do it with proper encapsulation. The previous example let you perform arbitrary IO with “liftIO”, which is exactly what you don’t want.
class MonadTT m where
    ttGetLine :: m String
    ttPutStrLn :: String -> m ()
instance MonadTT IO where
    ttGetLine = getLine
    ttPutStrLn = putStrLn
class MonadFS m where
    fsReadFile :: FilePath -> m String
instance MonadFS IO where
    fsReadFile = readFile
cat :: (MonadTT m, MonadFS m) => FilePath -> m ()
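One payoff of the encapsulated version is that the same `cat` can run against a fake world. The sketch below repeats the two classes so it stands alone, gives `cat` one possible body, and adds a hypothetical pure interpreter called `Fake` (not from any library) that reads from an association list and collects teletype output. Note the extra `Monad m` constraint on `cat`: the classes as written have no `Monad` superclass, so using `>>=` needs it.

```haskell
class MonadTT m where
  ttGetLine  :: m String
  ttPutStrLn :: String -> m ()

class MonadFS m where
  fsReadFile :: FilePath -> m String

instance MonadTT IO where
  ttGetLine  = getLine
  ttPutStrLn = putStrLn

instance MonadFS IO where
  fsReadFile = readFile

-- One possible body for cat: read the file, print its contents.
cat :: (Monad m, MonadTT m, MonadFS m) => FilePath -> m ()
cat path = fsReadFile path >>= ttPutStrLn

-- Hypothetical pure interpreter: fake file system in, teletype output out.
newtype Fake a = Fake { runFake :: [(FilePath, String)] -> ([String], a) }

instance Functor Fake where
  fmap f (Fake g) = Fake $ \fs -> let (out, a) = g fs in (out, f a)

instance Applicative Fake where
  pure a = Fake $ \_ -> ([], a)
  Fake f <*> Fake g = Fake $ \fs ->
    let (o1, h) = f fs
        (o2, a) = g fs
    in (o1 ++ o2, h a)

instance Monad Fake where
  Fake g >>= k = Fake $ \fs ->
    let (o1, a) = g fs
        (o2, b) = runFake (k a) fs
    in (o1 ++ o2, b)

instance MonadTT Fake where
  ttGetLine    = Fake $ \_ -> ([], "")   -- no input in this sketch
  ttPutStrLn s = Fake $ \_ -> ([s], ())

instance MonadFS Fake where
  -- A missing file reads as the empty string (an assumption of this sketch).
  fsReadFile p = Fake $ \fs -> ([], maybe "" id (lookup p fs))
```

With this, `runFake (cat "a.txt") [("a.txt", "hi")]` evaluates to `(["hi"], ())` without touching the real file system, while `cat "a.txt" :: IO ()` still does the real thing.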
i was wondering more whether it is possible to restructure the io monad (using algebras as in data types a la carte, or as you did here) while _preserving backwards compatibility_.
right now io is very broad and can be pretty much everything. it should be more fine-grained. (i still have to look into safe haskell and its RIO, but my superficial look says it's something different.)
So basically, “main” is not an impure function. It’s a pure value representing an impure function. A monad is essentially a data type that represents computations in some domain-specific language, using data (return) and code (>>=) from the host language. The only thing special about the IO datatype is that the impure Haskell runtime can evaluate it.
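A rough sketch of the "monad as a data type for a small language" idea, in the spirit of the teletype example from the Swierstra and Altenkirch paper cited earlier in the thread. The type `Tele` and every function name here are invented for illustration. The program `echo` is plain data; `runIO` plays the part of the impure runtime, and `runPure` interprets the same data with canned input.

```haskell
-- A hypothetical two-instruction language; Done plays the role of return.
data Tele a
  = Done a
  | GetLine (String -> Tele a)
  | PutLine String (Tele a)

instance Functor Tele where
  fmap f (Done a)      = Done (f a)
  fmap f (GetLine k)   = GetLine (fmap f . k)
  fmap f (PutLine s n) = PutLine s (fmap f n)

instance Applicative Tele where
  pure = Done
  tf <*> ta = tf >>= \f -> fmap f ta

instance Monad Tele where
  Done a      >>= f = f a
  GetLine k   >>= f = GetLine (\s -> k s >>= f)
  PutLine s n >>= f = PutLine s (n >>= f)

getL :: Tele String
getL = GetLine Done

putL :: String -> Tele ()
putL s = PutLine s (Done ())

-- A pure value describing "read a line, then print it".
echo :: Tele ()
echo = getL >>= putL

-- Interpreter 1: maps each instruction to real IO.
runIO :: Tele a -> IO a
runIO (Done a)      = pure a
runIO (GetLine k)   = getLine >>= runIO . k
runIO (PutLine s n) = putStrLn s >> runIO n

-- Interpreter 2: pure, feeding canned input and collecting output.
runPure :: [String] -> Tele a -> ([String], a)
runPure _      (Done a)      = ([], a)
runPure (i:is) (GetLine k)   = runPure is (k i)
runPure []     (GetLine k)   = runPure [] (k "")  -- out of input: read ""
runPure is     (PutLine s n) = let (out, a) = runPure is n in (s : out, a)
```

`runPure ["hi"] echo` gives `(["hi"], ())`, and `runIO echo` is an ordinary `IO ()` you could assign to `main`. The real `IO` type is abstract, so this is an analogy for how to think about it, not a description of how GHC implements it.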
If you want to do this in practice rather than as a specialised example, I would suggest using something like Conduit, rather than reinventing the wheel.
Conduit is used widely in the Yesod web framework (for example, to implement the HTTP server), and could easily be hooked up to stdin and stdout for an interactive pure program.
It's useful with Safe Haskell, which limits a few things so you can 'safely' run untrusted code. With this interface you could limit IO actions the user is allowed to run.
It can be. A value of type `IO foo` is an ordinary, pure value just like a value of any other data type. It can be passed as an argument to a pure function, returned from a pure function, manipulated using any functions that can apply to `IO` values, stored inside an arbitrary data structure, composed with other `IO` values to make new ones, etc.
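A small sketch of what that looks like, with made-up names (`choose`, `runReversed`, `demo`) and an `IORef` log standing in for real output so the result can be inspected. Both helper functions are pure: they take `IO` values and return an `IO` value without running anything.

```haskell
import Data.IORef (newIORef, modifyIORef, readIORef)

-- A pure function over IO values: it picks one action, it runs nothing.
choose :: Bool -> IO a -> IO a -> IO a
choose useFirst a b = if useFirst then a else b

-- A pure function that builds one new action from a list of actions.
runReversed :: [IO ()] -> IO ()
runReversed = sequence_ . reverse

demo :: IO [String]
demo = do
  logRef <- newIORef []
  let say s   = modifyIORef logRef (++ [s])          -- hypothetical logging action
      actions = [say "one", say "two", say "three"]  -- IO values stored in a list
  runReversed actions
  choose False (say "left") (say "right")
  readIORef logRef
```

Running `demo` yields `["three","two","one","right"]`: the list order did not decide the execution order, and `say "left"` was passed around but never performed.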
What isn't pure is the process of extracting the `foo` from the `IO foo`, i.e. getting the result of the I/O action. For instance,
getLine :: IO String
is not a function that you can "call" to get a string from stdin. It is instead a value which represents an I/O operation yet to be performed. There is no way to force it to yield its result within your entirely pure program (ignoring escape hatches like `unsafePerformIO` and friends), but you can nonetheless compose it with other I/O operations which depend on its result, such as `putStrLn`:
echo :: IO ()
echo = getLine >>= putStrLn
The above defines a pure value representing an I/O operation which, when performed, will echo a line from stdin to stdout. The `>>=` combines the two pure values to yield a new pure value. Again, there is no way to directly cause this I/O action to be executed, which would necessarily introduce impurity. You can only ensure that it will happen later by threading it into the special I/O operation `Main.main :: IO ()`, which every program must define, and which will be implicitly performed when your program is run.
By reifying "statements with side effects" into simple, pure, first-class values of type `IO a`, Haskell lets us remain 100% pure even while our program will eventually have effects on the real world outside of the machine. Haskell code is just a set of pure declarations; the side effects are deferred until after our code has already run (which conceptually happens instantaneously when the program is launched). Since our code runs "before" any side effects occur, it cannot possibly depend on their results.
A Haskell program merely sets up the dominoes in an elaborate arrangement, then looks the other way while the runtime system knocks over the first one.
I appreciate your detailed answer, but I don't understand the distinction - it feels almost like a form of solipsism to me.
Say I'm a traveler exploring the programming language universe, and I land on a programming language planet. I wish to determine whether said planet has functions that "you can call in order to get a string from stdin", or whether it "sets up the dominoes in an elaborate arrangement" and "lets the runtime system knock them down". What experiment could I perform? What program could I write that would illustrate the difference?
That's not entirely fair. Better to say that Haskell separates the pure from impure, but it has both. The straight line dependencies of the IO sequencing are still part of the Haskell program.
The domino analogy could apply to C just as well, except that there are only a few lines of code before the first domino falls.
Sorry, but I disagree. A Haskell program emphatically does not have any concept of impurity or statefulness whatsoever, except in the form of `unsafePerformIO`, which circumvents the type system, only comes into play under "extenuating circumstances", and which an ordinary program has no use or need for.
The language itself does not have any notion of sequencing or order of evaluation, nor does it need one. It inhabits a pure, mathematical universe where such things have no meaning. Given an arithmetic expression such as
(1 + 2) * (3 - 4)
it naturally makes no difference which parenthesized expression is evaluated first, because the evaluation of mathematical expressions cannot have side effects. It would be absurd to consider a universe in which such a choice could influence the result of fully evaluating the expression, because such a universe would be wholly unlike our own, to the extent that our intuitive rules of logic and causality would no longer apply.
Likewise, given the Haskell expression
putStrLn "Hello" >> putStrLn "world!"
it makes no difference which of the two IO expressions is evaluated first, because the evaluation of those values cannot have side effects. The second one may be evaluated before the first; it makes no difference. The execution of the resulting action will have side effects and impose sequencing, but there is no way to express execution or cause it to occur in the Haskell language. Execution can only happen "outside" of the context of the program itself, not in the program's abstract and pure universe, but in our own imperative and stateful one, as a side effect of the RTS.
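Here is a sketch of that claim that can be checked. To make the outcome observable, `say` (an invented stand-in for `putStrLn`) appends to an `IORef` log instead of printing. The second action is deliberately forced with `seq` before the first, and the composite is forced too, yet the log stays empty until the composite is actually executed, and then the order is the one `>>` describes.

```haskell
import Data.IORef (newIORef, modifyIORef, readIORef)

demo :: IO ([String], [String])
demo = do
  logRef <- newIORef []
  let say s = modifyIORef logRef (++ [s])  -- records instead of printing
      hello = say "Hello"
      world = say "world!"
      -- Force evaluation of `world` before `hello`, then build the sequence.
      both  = world `seq` hello `seq` (hello >> world)
  both `seq` pure ()          -- evaluate the composite action; nothing runs
  before <- readIORef logRef
  both                        -- now execute it
  after <- readIORef logRef
  pure (before, after)
```

`demo` returns `([], ["Hello","world!"])`: evaluation order had no effect on the log, and execution order came only from how the actions were composed.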
A C program, to extend the metaphor, knocks over each domino one by one as soon as it is placed. In fact, placing the domino and knocking it over are one and the same thing. The programmer has no opportunity to leave the room before the side effects come into play, because imperative languages do not distinguish between evaluation and execution. A purely functional language not only draws that distinction, but eliminates execution from the picture entirely, leaving it to some external entity to actually do the deed that the program constructs from pure components.
From what you wrote, it sounds like what Haskell buys you is a layer of indirection - the proper analogy is not between a Haskell program and a C program, but instead between a Haskell program and a C compiler. A C compiler does not care in what order it outputs code, so long as it all gets output by the time it finishes.
If I understand this, Haskell "compiles" pure code into a sequence of side effects, and then the RTS executes them. So say I wrote a C compiler that, given some source code, outputs a subcompiler that compiles the source code and then executes it. My C code does the same thing as it would under a traditional compiler, except more slowly - why should I want this?
It can be. But what gets intermingled with the other functions is indeed the pure language describing what IO actions to take, not their result. For instance, I've had some good experiences returning IO actions out of STM transactions.
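A minimal sketch of that STM pattern, assuming the `stm` package that ships with GHC. The `withdraw` example and its names are hypothetical. The transaction cannot perform IO, but it can return a description of the IO to perform, and `join` runs that description only after the transaction has committed.

```haskell
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
import Control.Monad (join)

-- Decide inside the transaction which IO action should run afterwards.
withdraw :: TVar Int -> Int -> STM (IO ())
withdraw balance amount = do
  current <- readTVar balance
  if current >= amount
    then do
      writeTVar balance (current - amount)
      pure (putStrLn ("withdrew " ++ show amount))
    else pure (putStrLn "insufficient funds")

demo :: IO Int
demo = do
  balance <- newTVarIO 100
  join (atomically (withdraw balance 30))   -- commit, then run the chosen action
  join (atomically (withdraw balance 500))  -- balance unchanged, reports failure
  readTVarIO balance
```

`demo` ends with a balance of 70. If a transaction is retried, the returned action from the abandoned attempt is simply discarded, which is what makes this safer than trying to do the IO from inside the transaction.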