> when your program compiles it will be guaranteed to behave correctly
This is only true if you give your program a strong enough type, and giving your program a stronger type may make it harder to write.
For instance, here is a 14-line version of insertion in red-black trees, due to Okasaki:
data Color = Pink | Gray
data RBT a = Emp | Full Color (RBT a) a (RBT a)
insertRB x s = makeBlack (ins s)
where ins Emp = Full Pink Emp x Emp
ins (Full color a y b) | x < y = balance color (ins a) y b
| x == y = Full color a y b
| x > y = balance color a y (ins b)
makeBlack (Full _ a y b) = Full Gray a y b
balance Gray (Full Pink (Full Pink a x b) y c) z d = Full Pink (Full Gray a x b) y (Full Gray c z d)
balance Gray (Full Pink a x (Full Pink b y c)) z d = Full Pink (Full Gray a x b) y (Full Gray c z d)
balance Gray a x (Full Pink (Full Pink b y c) z d) = Full Pink (Full Gray a x b) y (Full Gray c z d)
balance Gray a x (Full Pink b y (Full Pink c z d)) = Full Pink (Full Gray a x b) y (Full Gray c z d)
balance color a x b = Full color a x b
I munged the names a bit so it can be put in the same module as the code below: the same algorithm, but in a way that ensures that the balance conditions are met. This is 95 lines:
{-# LANGUAGE GADTs, ExistentialQuantification, EmptyDataDecls, StandaloneDeriving #-}
data Z
data S n
data R
data B
data Tree count color a where
Nil :: Tree Z B a
Red :: Tree n B a ->
a ->
Tree n B a ->
Tree n R a
Black :: Tree n lc a ->
a ->
Tree n rc a ->
Tree (S n) B a
--deriving instance Show a => Show (Tree n c a)
data High n a = forall c . High (Tree n c a)
data Set a = forall n . Set (Tree n B a)
--deriving instance Show a => Show (Set a)
data T n c a where
N :: T Z B a
R :: Tree n cl a ->
a ->
Tree n cr a ->
T n R a
B :: Tree n cl a ->
a ->
Tree n cr a ->
T (S n) B a
data H n a = forall c . H (T n c a)
lbbalance :: H n a -> a -> Tree n c a -> High (S n) a
lbbalance (H (R (Red a b c) d e)) f g = High $ Red (Black a b c) d (Black e f g)
lbbalance (H (R a b (Red c d e))) f g = High $ Red (Black a b c) d (Black e f g)
lbbalance (H (R a@Black{} b c@Black{})) d e = High $ Black (Red a b c) d e
lbbalance (H (R Nil b Nil)) d e = High $ Black (Red Nil b Nil) d e
lbbalance (H (B a b c)) d e = High $ Black (Black a b c) d e
lbbalance (H N) a b = High $ Black Nil a b
rbbalance :: Tree n c a -> a -> H n a -> High (S n) a
rbbalance a b (H (R (Red c d e) f g)) = High $ Red (Black a b c) d (Black e f g)
rbbalance a b (H (R c d (Red e f g))) = High $ Red (Black a b c) d (Black e f g)
rbbalance a b (H (R c@Black{} d e@Black{})) = High $ Black a b (Red c d e)
rbbalance a b (H (R Nil c Nil)) = High $ Black a b (Red Nil c Nil)
rbbalance a b (H (B c d e)) = High $ Black a b (Black c d e)
rbbalance a b (H N) = High $ Black a b Nil
lrbalance :: High n a -> a -> Tree n B a -> T n R a
lrbalance (High (Red a b c)) d e = R (Red a b c) d e
lrbalance (High (Black a b c)) d e = R (Black a b c) d e
lrbalance (High Nil) a b = R Nil a b
rrbalance :: Tree n B a -> a -> High n a -> T n R a
rrbalance a b (High (Red c d e)) = R a b (Red c d e)
rrbalance a b (High (Black c d e)) = R a b (Black c d e)
rrbalance a b (High Nil) = R a b Nil
hhigh :: High n a -> H n a
hhigh (High Nil) = H N
hhigh (High (Red a b c)) = H (R a b c)
hhigh (High (Black a b c)) = H (B a b c)
insertTree :: Ord a => a -> Set a -> Set a
insertTree x (Set y) =
case insert x y of
H N -> Set Nil
H (R a b c) -> Set (Black a b c)
H (B a b c) -> Set (Black a b c)
insert :: Ord a => a -> Tree n c a -> H n a
insert x y@Nil = hhigh $ insertBlack x y
insert x y@Red{} = H $ insertRed x y
insert x y@Black{} = hhigh $ insertBlack x y
insertRed :: Ord a => a -> Tree n R a -> T n R a
insertRed x (Red l c r) =
case compare x c of
LT -> lrbalance (insertBlack x l) c r
EQ -> R l c r
GT -> rrbalance l c (insertBlack x r)
insertBlack :: Ord a => a -> Tree n B a -> High n a
insertBlack x Nil = High $ Red Nil x Nil
insertBlack x (Black l y r) =
case compare x y of
LT -> lbbalance (insert x l) y r
EQ -> High $ Black l x r
GT -> rbbalance l y (insert x r)
I agree that this is only true when the language is given a strong enough type and that giving such a type may be to cumbersome. In fact languages that can do most of these things already exist (Coq and Agda for example) but they are indeed to cumbersome.
That is why we need people experimenting with new languages like Epigram, attempting to find a way of writing these programs in such a way that these strong types don't become to cumbersome.
These strong types will always be somewhat cumbersome, because they are the proof that your program is correct in certain respects, and non-trivial proofs are not free. When you hear someone use the phrase "correct by construction," you should ask you do when you accidentally construct the wrong thing. I'm afraid the answer is that you "correct by reconstruction."
This is only true if you give your program a strong enough type, and giving your program a stronger type may make it harder to write.
For instance, here is a 14-line version of insertion in red-black trees, due to Okasaki:
I munged the names a bit so it can be put in the same module as the code below: the same algorithm, but in a way that ensures that the balance conditions are met. This is 95 lines: