Right. I don’t have well-developed enough Haskell taste to tell whether this approach leads towards a more elegant or alternatively beneficial way of representing this kind of scenario than matching over a discriminated union type.
My takeaway from the article is that you almost always want a proper sum (discriminated union) type, but if you find yourself in a world without (hello, Java), here's a way to encode them precisely. As the OP notes, though, this still relies on having generics in your language, so you can't even perform this encoding if you don't have generics.
For a counterpoint, take a look at the relationship between object algebras and final tagless style [0]. It goes well beyond the basic visitor pattern, but gives you a lot of expressivity in exchange.
This is exactly right. The relevant quote from the article is this:
> The reason we care about Church-encoding is because not all programming languages natively support sum types or recursion (although most programming languages support product types in the form of records / structs).
> However, most programming languages do support functions, so if we have functions then we can use them as a “backdoor” to introduce support for sum types or recursion into our language. This is the essence of the visitor pattern: using functions to Church-encode sum types or recursion into a language that does not natively support sum types or recursion.
That's contrived. If you're in Java, you would use a visitor pattern.
It is weird to given an example in Haskell of something you don't need to do in Haskell, but would theoretically want to do in some other language, but actually in these other languages you'd do something different.
I'm not sure I follow your point, because the thing you don't need to do in Haskell is precisely the visitor pattern, and you don't need to do it in Haskell because you have sum types; but you don't have sum types in Java, so of course you'd use the visitor pattern -- you have no other option.
I meant that you wouldn't use a Church encoding workaround for a visitor pattern. (That's what you seemed to mean by "here's a way to encode them precisely.") Instead, you'd use the visitor pattern in the normal way. So the Church encoding workaround is useful in neither Haskell nor Java.
The point of the original article is that the Visitor pattern is the Church encoding of sum types. It sounds like you see a stronger difference between what the article describes and what I understand the Visitor pattern to be. There might be light differences in how you organize various elements (such as grouping all the handlers into a single interface or taking them as individual parameters), but I see those as all within the same overall Visitor aegis.
The core of the Visitor pattern is a CPS transform, where you provide the value the code to run "next". Because this puts the sum type on the input side instead of the output side, you can distribute the types and get a pair of functions instead of a sum of values. Is there another difference I'm not accounting for?
I’ve seen the visitor pattern used mostly in compilers where there might be 50 cases in the AST for a compiler for a production language. So when looking at how the code is written for a visitor pattern, I think about how the code might look in a compiler if you had a lot of cases. Does it scale? The amount of boilerplate you need matters because it’s multiplied by 50.
The examples I’ve seen of Church encoding don’t look like code I’d want to read or write. Who wants to write a function that takes 50 functions as parameters? At least use a struct of functions. The struct serves as a vtable, so you might as well use an object-oriented language’s built-in vtables.
(Although I suppose in C, you’d need to implement the vtable yourself. Structs of functions are how OOP is commonly done in C.)
I understand how mathematically they’re equivalent but that’s ignoring practical differences in how they appear on the screen. It was neat when in Structure and Interpretation of Computer Programming they showed how you could implement data structures using only functions but that doesn’t make it a design pattern you want to use, it’s just trivia.
> Who wants to write a function that takes 50 functions as parameters? At least use a struct of functions. The struct serves as a vtable, so you might as well use an object-oriented language’s built-in vtables.
Okay, I think I see where we're missing each other. Church encoding doesn't have to be an all-or-nothing -- you don't have to commit to only function types and no other types. It's still useful when you apply it in localized contexts.
The key idea with the visitor pattern is that `A + B` is not a type you have in most languages. You can get an equivalent with generics and functions, but that doesn't mean you can't use products if it makes sense. The equivalent is an object with a method `<T> T visit(Visitor<T> v)`, where `Visitor<T>` is a pair of two methods, `T onA(A a)` and `T onB(B b)`.
Sure, you can always replace `(X, Y) -> Z` with `X -> Y -> Z` to get rid of even the product type, but that's not what anyone is proposing fundamentally here. It's just currying; it's not essential to the encoding being discussed.
The original blog post is using Haskell norms, and in Haskell, currying is very normal. But that doesn't mean it's a necessary ingredient of the demonstration. The key idea is to use a continuation-passing transform, to move the `A + B` value into the negative position of a function type (`forall T. (A + B -> T) -> T`), and then distribute over the function type so you're not relying on having a primitive sum type at all (`forall T. (A -> T, B -> T) -> T`). The extra step of currying (`forall T. (A -> T) -> (B -> T) -> T`) is unnecessary; it's just common in Haskell.
As you say:
> Who wants to write a function that takes 50 functions as parameters? At least use a struct of functions.
That's what `(A -> T, B -> T)` is. I suppose it could be confused for an argument list, but in Haskell, functions only take one argument. The type `(A -> T, B -> T)` is literally the type of a pair of two functions. It's isomorphic to a record type like `data Handlers { onA :: A -> T, onB :: B -> T }` -- you're just changing the indices from 0,1 to onA,onB.
If you ignore the details of implementation by treating isomorphic code refactorings as equivalent, it all looks kind of the same and there is no new technique here? In Java you would notice that a struct of functions can be replaced with methods and inheritance, and we’re back to the visitor pattern.
I guess maybe it’s that you could implement the visitor pattern in Haskell using a record that contains functions. But that’s a straightforward vtable implementation, and you wouldn’t really want to anyway.
> I guess maybe it’s that you could implement the visitor pattern in Haskell using a record that contains functions. But that’s a straightforward vtable implementation, and you wouldn’t really want to anyway.
On the contrary: typeclasses are exactly "records that contain functions", and Haskell desugars typeclass constraints to vtable-passing (if it can't otherwise inline the typeclass for known call sites). Pushing in this direction leads to the final tagless representation, which compares favorably with object algebras, a mild generalization of the idea of the Visitor pattern. [1]
I maintain that the core idea here is the replacement of `A + B -> T` with `(A -> T, B -> T)` to erase the sum type, and the use of the CPS transform from `A + B` to `forall T. ((A + B) -> T) -> T` to get the sum into a negative position in the first place. How you actually organize the resulting handler functions is a far smaller distinction. Even the Gang of Four is clear that their patterns are islands within a broader spectrum, and that each pattern has related variants based on the same idea. Currying (`(A, B) -> C` vs. `A -> B -> C`) is a separate and orthogonal idea from CPS, and I just don't see the value in including a choice of currying style in the Visitor pattern when it's an inessential flavoring.
> it all looks kind of the same and there is no new technique here?
Yes, that really is the point. The OP is explaining how these two different things in two different paradigms are actually the same thing, under the hood, and illustrates exactly how that's the case. It illuminates precisely where the Visitor pattern is useful: anywhere you'd use a sum type. Because both model closed families of options.
I find it fascinating that while we mostly understand each other, it seems like we still disagree on what sort of knowledge is useful. There is a weird gap here.
It seems like this is starting with something I basically knew? The visitor pattern in an object-oriented language is used for same things that a sum type and pattern-matching are used for in a functional language.
And then re-explaining it using a lot of mathematical jargon, which makes it considerably more obscure than the more hand-wavy explanation. And then saying "isn't that useful?"
But I was happy just informally knowing that they were equivalent, and I would be happier explaining this to someone else informally, using examples of the same code written in Java (say) and Haskell. I don't feel like the mathematical jargon helped? I guess it's sort of neat that you can explain it that way.
(Thanks for continuing to engage on this. It's a fun conversation :) )
Yes, I suspect the blog post is more interesting if you already have a theoretical background, and are interested in the various ways that theory plays out in applications. I certainly doubt the post was written with the HN audience in mind in particular.
For me, theory is like a compression algorithm for knowledge. The more I can interrelate things, the less I have to duplicate the commonalities, and the more I can focus on remembering (and judging based on) the differences. So I get a lot out of this kind of thing.
(Personally, I use these kinds of transforms all the time -- especially "defunctionalize the continuation", which is lovely for mechanically making a simple recursive algorithm into an iterative one more suitable for something like Java. The theoretical background makes me more comfortable going between different representations and keeping track of precisely what is changing and what is being held fixed.)
I guess it's more about fluency in changing code than fluency in reading or writing code. Dynamics rather than statics. I think I can appreciate that if you're just looking at Visitor or sum types in isolation, it's not worth getting into the weeds over. But it's comforting to me to know that if I need to change various parts of a system, I know exactly what and where my degrees of freedom are. I can redefine the module boundaries one step at a time, and relatively smoothly and slowly shift the mass of the codebase around.
I agree that the "defunctionalize the continuation" refactoring is neat and non-obvious.
My feeling about refactorings is that you can use a tool that does it automatically (preferred, when available) or you can do it by hand (when not). I'm not sure I need to be thinking abstractly in math to do the hand-refactoring though. If I know where I'm starting from and where I want to end up then I can informally pattern-match. But I guess it's a more intuitive approach and it sounds like you prefer to think about it a different way.
Refactoring in small mechanical steps is generally less risky, though good tests can also reduce the risk.
Possibly one difference is that in Haskell, you're close to thinking in math already.
We got Streams that don't short-circuit properly and Optionals that throw NullPointerExceptions, what creative fuck-ups do you reckon we'll get with sum types & pattern matching?