He does a wonderful job of taking very dense mathematical notation and explaining it in ways that anyone can understand. He derives the basic concepts of the lambda calculus from the ground up using Python. Super fun to follow along with.
That looks interesting, although it is 3 hrs. I like this ~45 minute intro to the concept, which seems a little more Turing-accessible, in that it shows how you can implement addition and multiplication in the system, which is a lot:
The number representation proposed by the author is one I've never seen before:
|0| = λx. x
|n+1| = λ_. |n|
In other words, 0 is identity, and successor is the const function. This makes the predecessor function easy to define. Just apply Succ n to anything to get n back. The article fails to answer the most crucial question about these numerals though:
Can we write a function that tests if a numeral represents 0?
I.e. a function f such that
f |0| = True
f |n+1| = False
Equivalently, we can ask if it's possible to convert these numerals to Church numerals, the standard number representation in lambda calculus.
If one cannot, then these numerals don't seem to be useful in any way, and don't deserve to be called a number representation.
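For concreteness, here is how those numerals and their one-step predecessor might look in Python (the names zero/succ/pred are mine, not the article's):

```python
# The numerals from the article: |0| is the identity function,
# and the successor of |n| is the constant function returning |n|.
zero = lambda x: x
succ = lambda n: lambda _arg: n

one = succ(zero)
two = succ(one)
three = succ(two)

# Predecessor is trivial: apply |n+1| to anything and it gives back |n|.
pred = lambda n: n(None)

assert pred(three) is two
assert pred(two) is one
```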
This reminds me of someone proposing to represent the number n in combinatory logic with basis {S,K} as simply S^n K, i.e. using K as Zero and S as Successor. That one does turn out to be convertible to Church numerals [1].
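A sketch of that S/K encoding in Python, showing just the representation (the actual conversion to Church numerals in [1] is far less obvious); the name sk_numeral is mine:

```python
# Standard S and K combinators.
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x

def sk_numeral(n):
    """Represent n as S^n K: zero is K, successor is application of S."""
    t = K
    for _ in range(n):
        t = S(t)
    return t

assert sk_numeral(0) is K
```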
I'm pretty sure it's impossible to write a function that tests for zero for these numerals. Necessarily, any such function f(|n|) would have to expand at some point into a toplevel |n|(args...) with some number of args. This call must be toplevel, not as an argument to another function, because otherwise it is lazy and not executed. The number of args also cannot be infinite, since that would require an infinitely large program.
The expansion into |n|(args...) must happen before we know what n is, because in the lambda calculus we can't know anything about a function without calling it. Since it happens before we know what n is, then the number of args will necessarily be the same for all n. When n > numargs, |n|(args...) ignores all its arguments and halts, and therefore f cannot be a useful test for zero.
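To see the problem concretely in a Python sketch of these numerals (zero = identity, succ = const): applying a numeral to fewer arguments than its size just peels off that many successors and discards the arguments entirely.

```python
zero = lambda x: x
succ = lambda n: lambda _arg: n

one = succ(zero)
two = succ(one)
three = succ(two)

# |3| applied to two arguments ignores both and yields |1|:
assert three("first arg")("second arg") is one
# Any fixed argument list is swallowed whole once n is large enough,
# leaving f no way to distinguish zero from nonzero in general.
```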
> Necessarily, any such function f(|n|) would have to expand at some point into a toplevel |n|(args...)
Actually, this is not true. f |n| could expand into \x. \y. (|n| args...) instead, where the args contain x and y. But the rest of your argument still applies to the application of |n|.
You're right, that was imprecise of me. But if F = (\x. \y. (|n| args...)) is equivalent to True or False, then it is also equivalent to (F True False), which brings us back to a toplevel (|n| args...).
Another slight correction/expansion is that (|n| args...) = |n - numargs| when n >= numargs. This happens to coincide with False when n = numargs + 1 (since |1| = λ_.λx.x is alpha-equivalent to False), so it would have been better if I had said "when n > numargs + 1".
It is interesting that in the linked document exercises 4 and 10 ask you to implement addition for those kinds of numerals. Can you think of a way of doing this without checking for zero? (I know that sometimes there are unsolvable exercises in the literature, but the author seems to give this routinely as an assignment [1].)
This PDF is an excerpt from the book Programming Distributed Computing Systems: A Foundational Approach by Professor Carlos Varela. He was my advisor at RPI, and one of the most pleasant professors I interacted with during my time there. His classes were notoriously hard, but I enjoyed them very much. The full book includes very concise walkthroughs of various other less well known calculi and theoretical distributed programming models (in particular, the AMST lambda calculus actor model). If you like language or concurrency theory I highly recommend picking up a copy.
> The normal order sequencing combinator is: Seq = λx.λy.(λz.y x) where z is chosen so that it does not appear free in y.
> This combinator guarantees that x is evaluated before y, which is important in programs with side-effects
This part didn't make sense to me. I can apply Seq to x = Ω = (λx.x x)(λx.x x) and y = (λa.λb.b) and z = λx.x, resulting in Seq x y z = y x = λb.b, which leaves x unevaluated.
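The same point can be simulated in Python by passing x as an unevaluated thunk: (λz.y) x simply discards x, so the divergent computation is never forced.

```python
def omega():
    # Diverges if ever actually called (standing in for Ω).
    return omega()

# Seq as defined in the text: Seq = λx.λy.(λz.y) x
Seq = lambda x: lambda y: (lambda z: y)(x)

false = lambda a: lambda b: b

# Pass omega itself (a thunk), never calling it: Seq binds it to z
# and throws it away, returning y untouched.
result = Seq(omega)(false)
assert result is false  # omega was never evaluated
```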
A sequencing operator cannot be defined within the lambda calculus, which has no notion of side-effect; it must be a function defined in a runtime, i.e. in an implementation of lambda calculus. An example is the function seq in Haskell.
That reminds me of “THE Ohio State University”, in contrast with other Big Ten schools and their common names (Illinois, Indiana, Iowa, Northwestern, Purdue, …), where the THE is often laughed at as just funny for the reason you cite.
Yes, I went to THE university of York. How dare the other one have the temerity to think it was THE university of York. No no, it's a jumped up teacher training college of Ripon and York St John, made university, just A university of York, not THE.
It's quite funny. Among the Big Ten universities (American Midwest), it's too late for Ohio State to undo the ongoing humor of "THE Ohio State": everyone has mocked it for decades and won't let it go.
In university, I had fun actually applying lambda calculus. Here’s a very silly python script.
# a factorial function written entirely with just single-argumented lambdas
# and calls
fac = ((lambda p: p(p)(lambda q: lambda n: ((n(lambda e: lambda e:
lambda a: a)(lambda i: lambda l: i))(lambda d: lambda c: c)
(lambda m: (lambda m: lambda z: m(n(z)))(q(m)))((lambda c: lambda x:
n(lambda g: lambda h: h(g(c)))(lambda u: x)(lambda u: u))))))
(lambda s: lambda q: lambda w: q(s(s)(q))(w)))
# for converting between Church numerals and Python numbers (you need
# a few Python functions and operators to manipulate them)
from functools import reduce  # needed on Python 3
natural = lambda c: c(lambda x: x + 1)(0)
church = lambda n: reduce(lambda x, y:
    (lambda n: lambda f: lambda x: f(n(f)(x)))(x), range(n), lambda f: lambda x: x)
# test!
# fac(10) is about the highest input that computes in a few seconds on a Core 2 CPU
print(natural(fac(church(10))))
It was an attempt to find the simplest possible mathematical system that was universal, i.e. could compute any function. It turns out that the answer is that function application by itself is universal (if functions are first-class entities). This was a surprise at the time, and it still generally surprises people today when they first learn of it.
There was a concerted effort in the late 19th and early 20th century (perhaps earlier too) to mechanise computation, i.e. to reduce it to pure symbolic manipulation. There were obvious benefits; a famous example is the Bombe, the machine used to successfully break the German Enigma cipher during WW2.
On the philosophical side, a parallel and overlapping effort was going on to figure out whether mathematics could represent all possible truths, again as a symbolic manipulation system. Bertrand Russell's magnum opus Principia Mathematica [1] was one such famous work towards it. Kurt Gödel then made a breakthrough when he proved that such a system is impossible: a (sufficiently expressive) system can either capture all the truths or be consistent, but not both [2]. Put differently, any mathematical system capable of representing all the truths will necessarily contain contradictions within it.
Now coming to your question.
Lambda calculus emerged in this milieu. Alonzo Church [3] invented one such system to mechanise computation, named the λ-calculus. Using this system one can mechanically compute any computable function purely by symbolic manipulation. Turing, who later became Church's doctoral student, invented a totally different system with the same purpose, named the Turing Machine. The two systems were then proved equivalent (by Turing, I believe), which underpins the Church-Turing thesis [4].
All this work, and more, laid the theoretical foundation for modern computers; much of what we can prove about programs and machines today rests on it.
Phil Wadler has an absolutely delightful talk where he takes us through a whirlwind tour of the history of the mathematical foundation of computers [5], highly recommended.
Does anyone know if lambda calculus can be done with standard function notation, or with something closer to it? I understand that lambda calculus works with expressions and symbols, while standard function notation works with functions and values; still, using the same notation seems natural to me: x -> e for lambda, and e(x) for application. I haven't worked it out, but I wonder what I would get if I did.
> I understand that lambda calculus works with expressions and symbols, while standard function notation works with functions and values
There's no inherent connection. You're perfectly free to interpret lambda terms as functions on a domain, or function definitions as abstract rewrite rules.
I had one of his classes. He really knows his stuff. Spoke fast, no filler, and had an answer prepared for any question related to programming language theory. I felt a little bad though, because a lot of the material went over everyone's head (it was a required course for CS majors).
As a 100-hour, thrice-restarted student pilot, I find it the perfect mix of the physical, the mental, math, and procedures. Actually flying a small aircraft is physical coordination and, at times, task saturation. Lots of procedures and optimizations. The navigation, especially the dying art of radio navigation (with paper charts, too) and flight calculators, is lots of geography, math, and tools.
Drumming is some of that but other than motorcycles I've not found anything else that comes close to flying. I've never tried sailing but maybe that too?
That's a bit of a strange statement. The Y combinator is an expression in λ calculus. It's like saying French is even more beautiful than the phrase "nouveau départ".
The Y combinator is in a sense the heart of the λ calculus; it's a key discovery for understanding that the λ calculus is universal, which is what makes it a useful concept rather than an arbitrary bundle of rules.
The simply typed and, especially, the dependently typed lambda calculi are extremely versatile and powerful subsets of the untyped lambda calculus which do not admit the Y combinator. Their usefulness does not at all require universality; they are, perhaps, more useful for being typed.
There are many useful things that are not "the lambda calculus". "The lambda calculus" is more widely known and important, largely because it is universal, hence Church-Turing thesis.
Of course there are other useful things. But a comment argued that the presence of Y is what makes untyped lambda calculus useful. My rebuttal is that certain subsets which (intentionally) lack Y are more useful. Applications of typed calculi abound, but one rarely sees applications of the untyped calculus. Universality is not so important, it would seem. For example, infinite computations can still be modeled in typed calculi, but one has more control over them in that setting than running on "bare metal" untyped lambda calculus.
Ok, "useless" is a slight exaggeration; non-universal calculi can be useful. But the thing that makes the lambda calculus so important and famous - the thing that makes it "the lambda calculus" - is universality, and that's the ur-application that makes all of the subsequent application of calculi to mechanical computation possible.
The Y combinator is a specific fixed-point combinator in lambda calculus. It is used to express recursion in lambda calculus where direct recursion is not initially available due to the lack of named functions. The Y combinator demonstrates the power of lambda calculus.
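As a sketch: in an eager language like Python one uses the Z combinator, the eta-expanded variant of Y, since the plain Y combinator loops forever under strict evaluation.

```python
# Z combinator: a fixed-point combinator that works under strict evaluation.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Recursion without any named self-reference: the "recursive call" rec
# is supplied by the fixed-point combinator itself.
fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
assert fact(5) == 120
```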
> which is what makes it a useful concept rather than an arbitrary bundle of rules.
I cannot recall a single practical FP language based on the Y combinator for recursion. The typical approach is to extend lambda calculus with a separate recursion primitive.
The lambda calculus was never meant to be a "practical FP language"; it predates programming languages and even mechanical stored-program computers as we understand them.
What are the useful cases of the Y combinator then? (I don't say that the lambda calculus is useless, I was responding to a comment that claimed the lambda calculus is useless without the Y combinator.)
the great benefit of the y-combinator is that it permits indefinite iteration in a system that seems at first glance to not support iteration at all
so when you're faced with a formal system that doesn't seem to support indefinite iteration, perhaps because you are trying to design it to guarantee termination, a useful exercise is to try to construct an analogue of the y-combinator in it
if you can't, you may have gained enough insight into the problem to show that the system actually does guarantee termination, and maybe even show useful time bounds on it
if you can, you have usually shown the system is turing-complete, which means you can appeal to rice's theorem whenever you are tempted to analyze the behavior of the system in any way, so you can satisfy yourself with partial decision procedures that work often enough to be useful and are conservative in the appropriate way, rather than wasting time trying to find a complete solution
(also if you're designing the system you might consider either supporting iteration more directly or removing the back door)
every once in a while it's also useful in practice to be able to program a weird machine, too
> the great benefit of the y-combinator is that it permits indefinite iteration in a system that seems at first glance to not support iteration at all
This is only the case for untyped systems. If we try to type the Y combinator, we eventually need some notion of recursion; for example, in Haskell/OCaml we can type it with iso-/equi-recursive types. How many truly untyped systems do we use in practice?
The Y combinator as a counterexample of termination is a great insight though!
Dynamically typed programming languages are statically untyped.
You might argue they are unityped, but that’s really a type theoretician’s job security: if a language has the untyped lambda calculus as a subset (like the python example in this discussion) it’s reasonable to call it untyped.
Emulating recursion, and more generally the whole Church-Turing thesis. The fact that you can represent mechanical computation through, well, computation - the fact that equating those concepts makes sense at all, which was such a huge innovation that today it seems too banal to even notice.
You keep answering with the same words though. My question was _where_ the Y combinator is useful, not _what_ it is designed for, which I am of course aware of.
What do you mean "where"? In Princeton? In the Proceedings of the London Mathematical Society? In the head of anyone who presumes to think about computation?
you could argue that Y is also an advanced CPS transform, and a lot of patterns rely on a similar idea, mainly passing the following computation as a functional parameter
The Miranda programming language was an early lazy functional programming language, an ancestor of Haskell. It compiles down to combinators, and uses combinator reduction as its evaluation strategy. This became obsolete by the end of the 1980s but it’s still neat to learn how it works. One of the primitive combinators is Y. Here is its implementation: https://github.com/pkreyenhop/miranda/blob/master/reduce.c#L...