#include <concepts>
#include <iostream>
#include <numbers>

template<class shape>
concept Shape = requires(
    shape s,
    shape (&&circle)(double, double, double),
    shape (&&rectangle)(double, double, double, double)) {
  { s(circle, rectangle) } -> std::same_as<shape>;
};

auto Circle = [](double x, double y, double r) -> Shape auto {
  return [=](auto&& circle, auto&& rectangle) { return circle(x, y, r); };
};

auto Rectangle = [](double x, double y, double w, double h) -> Shape auto {
  return [=](auto&& circle, auto&& rectangle) { return rectangle(x, y, w, h); };
};

Shape auto exampleCircle = Circle(2.0, 1.4, 4.5);
Shape auto exampleRectangle = Rectangle(1.3, 3.1, 10.3, 7.7);

auto area = [](Shape auto shape) -> double {
  return shape(
      [](auto, auto, auto r) { return std::numbers::pi * r * r; },
      [](auto, auto, auto w, auto h) { return w * h; });
};

int main() {
  std::cout << area(exampleCircle) << std::endl;
  std::cout << area(exampleRectangle) << std::endl;
}
Of course, this is missing type-erasure, which you'd need to pass `Shape` across TU boundaries, but that's unnecessary if you're happy to use parametric polymorphism.
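For reference, one way to do that erasure is to pin the handlers' result type (say, double) and wrap everything in std::function; a minimal sketch, with names of my choosing:

#include <functional>

// Handlers with a fixed result type, and a Shape erased behind std::function.
using CircleFn    = std::function<double(double, double, double)>;
using RectangleFn = std::function<double(double, double, double, double)>;
using ErasedShape = std::function<double(CircleFn, RectangleFn)>;

ErasedShape erase(Shape auto shape) {
  return [shape](CircleFn circle, RectangleFn rectangle) {
    return shape(std::move(circle), std::move(rectangle));
  };
}

Fixing the result type per operation is exactly what you give up relative to the parametric version above.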
Yeah, I recognize the lambda expressions and not much else. How does the new stuff even interact with the older features? I can't even tell if something is recommended or deprecated anymore. Can you imagine the sheer work of maintaining a compiler for this thing? Really makes me appreciate the relative simplicity of C.
On a positive note, it looks like the standard library became amazing. It seems really well designed. Contains things that used to be preprocessor definitions, lots of very popular functionality, good symbol names and no cryptic abbreviations anywhere. Did it absorb some of boost?
In addition to the other TypeScript examples, here's one that I would actually use. It encodes the constructor arguments not as positional parameters, but as named arguments via objects, which lets me destructure them for added clarity. Roughly (a sketch; the exact names are incidental):
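type CircleArgs = { x: number; y: number; r: number };
type RectangleArgs = { x: number; y: number; w: number; h: number };

type Shape = <T>(handlers: {
  circle: (args: CircleArgs) => T;
  rectangle: (args: RectangleArgs) => T;
}) => T;

const circle = (args: CircleArgs): Shape => handlers => handlers.circle(args);
const rectangle = (args: RectangleArgs): Shape => handlers => handlers.rectangle(args);

const area = (shape: Shape): number =>
  shape({
    circle: ({ r }) => Math.PI * r * r,
    rectangle: ({ w, h }) => w * h,
  });

console.log(area(circle({ x: 2, y: 1.4, r: 4.5 })));
console.log(area(rectangle({ x: 1.3, y: 3.1, w: 10.3, h: 7.7 })));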
This feels like a pattern where the naming choices obscure the intent. The ‘Shape’ type doesn’t capture the essence of a ‘shape’ - it captures the essence of being ‘visitable by a visitor that knows about shapes’ or ‘able to handle shapes’. The things which are instances of Shape are functions that accept circles or rectangles, not actual circles or rectangles. So maybe call it ‘VisitableShape’, or ‘ShapeHandler’; and instead of calling its functions ‘circle’ and ‘rectangle’, call them ‘visitCircle’ or ‘handleCircle’...
Also, you seem to have created functions and types with the same names (Circle and Rectangle) which seems dangerous.
> The things which are instances of Shape are functions that accept circles or rectangles, not actual circles or rectangles.
The naming is confusing, but it's misled you in the opposite direction. Things that are instances of Shape are functions that accept handlers of circles or rectangles. The handlers, unfortunately, are named `circle` and `rectangle`. I would prefer `onCircle` and `onRectangle` in this context, because we're lacking the context of a true `match` expression to disambiguate the naming.
> Also, you seem to have created functions and types with the same names (Circle and Rectangle) which seems dangerous.
I think this is an idiom for defining a canonical constructor for a type. It's a little funky here because they're returning Shape, not Circle or Rectangle, and the latter are not subtypes of Shape. But it mostly tracks alongside the rest of the encoding.
You're speaking at the level of the code, while I'm speaking at the level of the domain. Neither of us are wrong, but we're talking past each other.
The Shape class itself is formally the sum of Circle and Rectangle -- you can think of this as a "prefer composition over inheritance" principle, but it's still a way of relating Circle to Shape. In particular, the Circle and Rectangle factories are canonical injections of these two classes into the Shape class. The naming of the factories is a little suspect, but if the only use of Circle and Rectangle is meant to be in the context of Shape, it mostly flies.
> Now, in order for it to work, a language has to support closures, not just functions.
Objects are a reasonable stand-in for closures. Pass in the first batch of arguments to the constructor, and store them as fields. Access them from the method (which plays the role of the "returned" closure) when it's called later.
(This is effectively the same as "closure conversion" in the literature, but we're taking advantage of the implicit-receiver convention to hide the parameter by which we're passing the prepared fields.)
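For example, a hand-rolled "closure" for a partially applied adder might look like this (a minimal sketch; the names are mine):

// Closure conversion by hand: the captured argument becomes a field,
// and the method plays the role of the returned closure.
final class Adder {
  private final int x;            // "captured" in the constructor
  Adder(int x) { this.x = x; }    // first batch of arguments
  int applyTo(int y) {            // called later, like the inner function
    return x + y;
  }
}
// new Adder(2).applyTo(3) == 5, much like add(2)(3) with real closures.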
interface Shape<T> {
interface _Circle<T> {
T apply(int x, int y, int r);
}
interface _Rectangle<T> {
T apply(int x, int y, int w, int h);
}
T apply(_Circle<T> circle, _Rectangle<T> rectangle);
static <T> Shape<T> Circle(int x, int y, int r) {
return (circle, rectangle) -> circle.apply(x, y, r);
}
static <T> Shape<T> Rectangle(int x, int y, int w, int h) {
return (circle, rectangle) -> rectangle.apply(x, y, w, h);
}
static double area(Shape<?> shape) {
var s = (Shape<Double>) shape;
return s.apply(
(x, y, r) -> Math.PI * r * r,
(x, y, w, h) -> 1. * w * h
);
}
static void main(String[] args) {
var exampleCircle = Circle(2, 1, 4);
var exampleRectangle = Rectangle(1, 3, 10, 7);
System.out.println(area(exampleCircle));
System.out.println(area(exampleRectangle));
}
}
BTW, you can use the same encoding when you write a Promise in JS.
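For instance, then takes one handler per outcome, just like the Shape encoding takes one handler per constructor (somePromise standing in for any promise):

// A promise is (morally) a sum of "fulfilled" and "rejected";
// consuming it means supplying one continuation for each case.
somePromise.then(
  value => console.log("fulfilled:", value),
  error => console.log("rejected:", error)
);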
As with the TypeScript examples, we can make the top-level apply method generic instead of the whole class to avoid some casts. In particular, notice that the <T> inferred from calling Circle() and Rectangle() is the meaningless <Object>.
The downside is that Java doesn't let you implement generic methods with lambdas, which is silly, so you have to use an anonymous subclass. But at least the call-side uses stay the same.
(And then, combining the `_Circle` and `_Rectangle` interfaces gets you to a more canonical visitor pattern, at the expense of lambdas at the call-site; there's a sketch of that after the code below.)
public interface Shape {
interface _Circle<T> {
T apply(int x, int y, int r);
}
interface _Rectangle<T> {
T apply(int x, int y, int w, int h);
}
<T> T apply(_Circle<T> circle, _Rectangle<T> rectangle);
static Shape Circle(int x, int y, int r) {
return new Shape() {
public <T> T apply(_Circle<T> circle, _Rectangle<T> rectangle) {
return circle.apply(x, y, r);
}
};
}
static Shape Rectangle(int x, int y, int w, int h) {
return new Shape() {
public <T> T apply(_Circle<T> circle, _Rectangle<T> rectangle) {
return rectangle.apply(x, y, w, h);
}
};
}
static double area(Shape shape) {
return shape.apply(
(x, y, r) -> Math.PI * r * r,
(x, y, w, h) -> 1. * w * h
);
}
static void main(String[] args) {
var exampleCircle = Circle(2, 1, 4);
var exampleRectangle = Rectangle(1, 3, 10, 7);
System.out.println(area(exampleCircle));
System.out.println(area(exampleRectangle));
}
}
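For completeness, the combined-visitor variant looks roughly like this (a sketch):

public interface ShapeVisitor<T> {
  T circle(int x, int y, int r);
  T rectangle(int x, int y, int w, int h);
}

Shape's method becomes <T> T apply(ShapeVisitor<T> visitor), and because ShapeVisitor now has two abstract methods, call sites can no longer pass lambdas; area needs an anonymous class:

static double area(Shape shape) {
  return shape.apply(new ShapeVisitor<Double>() {
    public Double circle(int x, int y, int r) { return Math.PI * r * r; }
    public Double rectangle(int x, int y, int w, int h) { return 1. * w * h; }
  });
}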
> The downside is that Java doesn't let you implement generic methods with lambdas ...
Yes, that's exactly why I've chosen to make the type parametric and not the apply method, but you're right that the classical visitor pattern uses a polymorphic method (accept), not a polymorphic class.
Sure. Then, if you're set on using lambdas, I'd recommend this, to at least take away the ability to choose <T> from the caller.
static Shape<?> Circle(int x, int y, int r) {
return (circle, rectangle) -> circle.apply(x, y, r);
}
static Shape<?> Rectangle(int x, int y, int w, int h) {
return (circle, rectangle) -> rectangle.apply(x, y, w, h);
}
The boolean lets us separate how to make a decision (which we define in function1) from how to handle a decision (which we define in function2). Roughly:
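// How to make a decision (FOO, BAR, etc. are placeholder statements)
boolean function1(...) {
  FOO
  BAR
  return true;
}
// How to handle a decision
void function2(boolean decision, ...) {
  if (decision) {
    BAZ
  } else {
    QUUX
  }
}
We can always eliminate this data by bringing those two things together, e.g.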
// Inline function1 into function2
void function3(...) {
FOO
BAR
if (true) {
BAZ
} else {
QUUX
}
}
// Simplify the if/then/else
void function4(...) {
FOO
BAR
BAZ
}
Of course, it's very useful to keep things separate and to pass data around (modularity, readability, etc.). My point is that the computer/language doesn't care about these things, so we can use refactorings like this to 'distill the essence' of what data 'really is' (from this one perspective, at least).
So what can we do with a boolean?
- We can discard it. Such code can always be refactored to eliminate the boolean (it's dead code).
- Pass it to some other code. We can always refactor this to remove such intermediate steps, e.g. above we inlined 'function1' to avoid passing the boolean around.
- Construct them. We do this by using the "constructors" 'true' or 'false', AKA the "introduction forms".
- Branch on them using if/then/else. This is also called "destructing", where if/then/else is a "destructor" or "elimination form".
- Simplify 'if(true) ...' and 'if(false) ...'. Here we are 'destructing a constructor', which can always be replaced by the relevant branch.
If we apply these refactorings over and over we can eventually simplify-away every boolean value in a program. There are a couple of things to keep in mind:
- Some languages have special-case boolean operations like '!foo'. Those are equivalent to functions containing branches, which can be refactored as above, e.g.
boolean not(boolean x) {
if (x) {
return false;
} else {
return true;
}
}
boolean and(boolean x, boolean y) {
if (x) {
return y;
} else {
return false;
}
}
boolean or(boolean x, boolean y) {
if (x) {
return true;
} else {
return y;
}
}
- To fully eliminate data, we need to include any 'external' code, e.g. 'inlining' external libraries, system calls, third-party APIs, etc. For simplicity I'll just assume we're using a single program in a single language, but the principle still holds in general.
Note that inlining intermediate steps and removing unused parameters have nothing to do with booleans themselves; they apply to all forms of data. Hence the 'essence' of booleans is the simplification of 'if(true)...' and 'if(false)...'.
We can do this with booleans since they are a "sum type", i.e. there are a few distinct forms of boolean (namely "true" and "false"). This is similar to the article's "Shape" type, which has a "Circle" form and a "Rectangle" form.
OOP languages don't tend to support sum types directly; instead we can 'fake it' using subtypes. Here's an example, where I've implemented the simplification behaviour using subtype polymorphism:
abstract class Boolean {
  abstract <T> T ifThenElse(T t, T e);
}

class True extends Boolean {
  <T> T ifThenElse(T t, T e) {
    return t;
  }
}

class False extends Boolean {
  <T> T ifThenElse(T t, T e) {
    return e;
  }
}
There's a slight wrinkle here, since calling 'ifThenElse(foo, bar)' will execute both 'foo' and 'bar'; we'll just get one of them returned back to us. If this is a problem, we can wrap our branches in zero-argument functions, pass those to 'ifThenElse', and call whichever function gets returned. Note that this is actually how Smalltalk implements boolean branching! https://rosettacode.org/wiki/Conditional_structures#Smalltal...
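Concretely, that wrapping might look like this (a sketch; thenBranch/elseBranch are placeholder methods):

// Wrap each branch in a zero-argument function (java.util.function.Supplier),
// then invoke whichever one comes back.
Supplier<Integer> chosen = x.ifThenElse(() -> thenBranch(), () -> elseBranch());
int result = chosen.get();  // only the chosen branch actually runs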
Hopefully I've demonstrated that the three classes defined above are a complete implementation of booleans: we have the constructors ('new True()' and 'new False()'), the destructor 'ifThenElse(thenBranch, elseBranch)', and we don't need anything else. To drive this point home, here's how we can define boolean algebra, translated from the above:
// not(x)
x.ifThenElse(new False(), new True())
// and(x, y)
x.ifThenElse(y, new False())
// or(x, y)
x.ifThenElse(new True(), y)
This implementation of Boolean is a rather simple form of "visitor pattern": the implementation of the "ifThenElse algorithm" is spread out amongst different implementations of the method, where each subclass knows how to handle its own case, and we just dispatch on whatever Boolean object we happen to have.
The point of Church Encoding is that we don't actually need these classes and polymorphism at all: rather than wrapping up these methods into classes, defining a class hierarchy, constructing instances of these subclasses, invoking methods on those instances, dynamically dispatching on the vtables, etc.; we can just pass the functions around directly!
Switching to JS, since it has first-class functions, here's my first example again:
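// The two "constructors" just pick one of their arguments (these are what
// we'd like to call true and false):
const churchTrue  = (t, e) => t;
const churchFalse = (t, e) => e;

// How to make a decision (FOO, BAR, BAZ, QUUX are placeholders as before)
function function1(...args) {
  FOO
  BAR
  return churchTrue;
}

// How to handle a decision: "if/then/else" is just a function call
function function2(...args) {
  return function1(...args)(BAZ, QUUX);
}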
If we want to avoid executing both BAZ and QUUX, we could wrap them up in functions like 'function1(...)(() => BAZ, () => QUUX)()'. We could also rename these functions to simply 'true' and 'false', since they're completely equivalent to those values (although that would be a syntax error in JS).
So now we've seen that (a) booleans can be completely implemented using if/then/else (b) the behaviour of if/then/else can be implemented using subclass polymorphism (c) subclass polymorphism can be emulated by passing around first-class functions. Note that we could have skipped all the OOP stuff and gone straight to functions, but I thought it might help some people, and it's also interesting in its own right.
This also extends to sum types with any number of distinct forms (which we now know are "constructors"). The rule is simple: a type with N constructors can be represented as a function which takes N arguments, and returns one of them. For example, we could represent Colour, with Red/Green/Blue like this:
const red = (r, g, b) => r;
const green = (r, g, b) => g;
const blue = (r, g, b) => b;
function printColour(c) {
console.log(c("Received red", "Given green", "Bestowed with blue"));
}
There are two other sorts of data to consider:
"Product types" contain multiple sub-values, e.g. 'Coordinate(x, y, z)'. For these, the rule is that we call the corresponding argument as a function, passing it the sub-values as arguments. For example:
const mkCoord = (x, y, z) => f => f(x, y, z);
function showCoord(c) {
  c((x, y, z) => console.log(
    "x: " + x.toString() + " y: " + y.toString() + " z: " + z.toString()
  ));
}
const origin = mkCoord(0, 0, 0);
showCoord(origin);
Note that I'm using raw JS numbers and strings for simplicity; we can implement these using Church Encoding (e.g. using 64 booleans to emulate a double-precision float), but it would be tedious for this example.
We can combine sums and products, i.e. each constructor can contain sub-values. This is what the "Shape" example from the article does. We just need to combine the two rules: take N arguments (once for each constructor), and call the appropriate argument with the sub-values. For example:
const circle = (x, y, r) => (ifCirc, ifRect) => ifCirc(x, y, r);
const rectangle = (x, y, w, h) => (ifCirc, ifRect) => ifRect(x, y, w, h);
const unitCircle = circle(0, 0, 1);
const unitSquare = rectangle(0, 0, 1, 1);
function drawShape(s) {
  s(
    (x, y, r) => console.log("Circle of radius " + r.toString()),
    (x, y, w, h) => console.log("Rect of area " + (w * h).toString())
  );
}
The final possibility is recursive data, where the sub-values can be of the same type we're defining. The article uses a binary tree as an example, but a list is easier. In OOP:
abstract class List<T> {
  // A list is consumed by folding it; f combines an accumulator with one
  // element (java.util.function.BiFunction<U, T, U>).
  abstract <U> U reduce(U init, BiFunction<U, T, U> f);
}

class Nil<T> extends List<T> {
  <U> U reduce(U init, BiFunction<U, T, U> f) {
    return init;
  }
}

class Cons<T> extends List<T> {
  private final T head;
  private final List<T> tail;
  public Cons(T head, List<T> tail) {
    this.head = head;
    this.tail = tail;
  }
  <U> U reduce(U init, BiFunction<U, T, U> f) {
    return this.tail.reduce(f.apply(init, this.head), f);
  }
}
With first-class functions (this might be slightly off; it's been a while!):
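const nil = (init, f) => init;
const cons = (head, tail) => (init, f) => tail(f(init, head), f);

// e.g. summing a list: each cons feeds its head into the accumulator,
// then hands the running result to its tail.
const oneTwoThree = cons(1, cons(2, cons(3, nil)));
console.log(oneTwoThree(0, (acc, x) => acc + x)); // 6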
So the trick to recursive data is that whenever we need to pass a sub-value to one of our N given functions, if that sub-value is of our recursive type (like the 'tail' of a list is another list) then we first call that sub-value with the same N functions we've been given. That way, the structure 'reduces' down to a single result.
Here's a slightly different Java example, for variety in the examples you're getting ;) I also recommend checking out the paper on object algebras [0], which takes this approach much further.
Notice that in some sense, the structure of the type is inferable from just the visitor interface. Everything else is just an explosion of boilerplate. (This is somewhat explained in the object algebras framework, as you can have multiple types accepting these visitors, just as much as having multiple visitors for the same type. In other words, this is related to the "expression problem".)
// An "external visitor", with the recursion external to the type being visited.
// The visitor is responsible for visiting each of the children.
public interface ExprVisitor1<Result> {
Result lit(double x);
Result add(Expr left, Expr right);
Result sub(Expr left, Expr right);
}
// An "internal visitor", with the recursion internal to the type being visited.
// I like these best, because the recursion can be turned into a tight loop
// (defunctionalize the continuation!) in a single place (in Expr).
// But you lose some expressivity, e.g. no short-circuiting the recursion.
// You can elaborate the recursion scheme of course, e.g.
// Result add(Supplier<Result> left, Supplier<Result> right);
// (but now it might be impossible to flatten the stack)
public interface ExprVisitor2<Result> {
Result lit(double x);
Result add(Result left, Result right);
Result sub(Result left, Result right);
}
public abstract class Expr {
  private Expr() {}

  public abstract <Result> Result visit(ExprVisitor2<Result> visitor);

  public static Expr lit(final double x) {
    return new Expr() {
      @Override public <Result> Result visit(final ExprVisitor2<Result> visitor) {
        return visitor.lit(x);
      }
    };
  }

  public static Expr add(final Expr left, final Expr right) {
    return new Expr() {
      @Override public <Result> Result visit(final ExprVisitor2<Result> visitor) {
        return visitor.add(left.visit(visitor), right.visit(visitor));
      }
    };
  }

  public static Expr sub(final Expr left, final Expr right) {
    return new Expr() {
      @Override public <Result> Result visit(final ExprVisitor2<Result> visitor) {
        return visitor.sub(left.visit(visitor), right.visit(visitor));
      }
    };
  }
}
From the text it seems this could be done in, say, C#, Java or C++, which sounds really interesting.
Anyone have any examples or outlines of this in any such language for non-functional plebs like me?