Foundations of Dawn: The Untyped Concatenative Calculus (dawn-lang.org)
94 points by smaddox on Nov 20, 2021 | 45 comments



I’ve been a proponent of the distinctive utility of the concatenative calculus for going on two decades now, so it is great to see a thoughtful attempt at a modern and practical implementation. Looking forward to seeing how it progresses.


> The first three intrinsics, swap, clone, and drop, directly correspond to the three common structural rules from proof theory... By restricting the use of these intrinsics, we can cover the full range of substructural logics.

Dang, I think you buried the lede.
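
To spell out the correspondence for anyone skimming: swap is the exchange rule, clone is contraction, and drop is weakening. A rough Python sketch (purely illustrative, not Dawn's implementation) of the three intrinsics as operations on a plain list-as-stack:

    # swap / clone / drop as exchange / contraction / weakening
    def swap(s):
        s[-1], s[-2] = s[-2], s[-1]  # exchange: reorder assumptions

    def clone(s):
        s.append(s[-1])              # contraction: duplicate an assumption

    def drop(s):
        s.pop()                      # weakening: discard an assumption

    s = [1, 2]
    swap(s); clone(s); drop(s)
    print(s)  # [2, 1]

Forbidding clone gives an affine discipline, forbidding drop a relevant one, forbidding both a linear one, and restricting swap an ordered one.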


;-) There's a lot of work left to do to demonstrate the value of this, but it's my primary motivation for basing Dawn on the concatenative calculus.


> There will inevitably be some computations which can be more compactly or efficiently expressed in the untyped concatenative calculus than in the untyped lambda calculus, and vice versa.

What would be an example of a computation more compactly expressed in the untyped concatenative calculus than in the binary lambda calculus [1] (where compactness can be directly measured as size in bits)?

I suppose that your programs could also be represented in binary, with 3 bits encoding the 6 intrinsics plus the two symbols '[' and ']', and with the end of a program marked by an unmatched ']'.

[1] https://tromp.github.io/cl/Binary_lambda_calculus.html
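
For concreteness, a sketch of such an encoding in Python (assuming the six intrinsics are swap, clone, drop, quote, compose, and apply; the particular code assignment is arbitrary):

    # Hypothetical 3-bit encoding: 6 intrinsics + '[' + ']' = 8 symbols.
    CODES = {w: format(i, '03b') for i, w in enumerate(
        ['swap', 'clone', 'drop', 'quote', 'compose', 'apply', '[', ']'])}

    def encode(program):
        # Split on whitespace and brackets, emit 3 bits per symbol,
        # and terminate with an unmatched ']'.
        tokens = program.replace('[', ' [ ').replace(']', ' ] ').split()
        return ''.join(CODES[t] for t in tokens) + CODES[']']

    print(encode('[clone apply] clone apply'))  # 21 bits with the terminator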


If you take a close look at that sentence, I was careful to limit the compactness claim to this pair of languages :-) I'm not surprised that there are pairs of languages where one is uniformly more compact or efficient than the other.


"Democratize formal software verification [...] perfect software."

Okay.

I like Forth too, but that's laying it on a tad thick, imo.


Formal methods have been used successfully for decades; it's not just a pipe dream. "Perfect software" should ideally be understood as something like ultra-low-defect software, though (that's the term the AdaCore folks use).

There are also other projects that aim to make formal software development much easier [0][1] and of course there's SPARK Ada.

[0] https://github.com/zetzit/zz

[1] https://github.com/dafny-lang/dafny


The formal verification aspect doesn't really have anything to do with Forth, but I do expect the multistack concatenative calculus to be much more ergonomic than applicative languages for linearity, which will play an important role in formal verification. We'll have to see how it all plays out, though.

But that's my goal. I'm not really interested in writing just another programming language. Even if I fail miserably, hopefully breaking the work up into small pieces and publishing as I go will at least inspire others with similar goals.


Author here. Ask me anything :-)


It would be interesting to see a comparison of this calculus with Joy's (http://joy-lang.org). Is that something you have thought about?


Joy is essentially a dynamically typed (single-stack) concatenative calculus, as far as I understand. So it's essentially an extension of this untyped concatenative calculus, with different names for some of the intrinsics.

Dawn itself will be notably different, though still ultimately based on the same core calculus.


This looks really awesome. A few questions:

1. Is the "main stack" (the unnamed one, the one your program begins with focus on) specially identifiable in some way? I note further down this comment tree that, if the "main stack" is simply whichever one happens to be in focus, there are some exciting possibilities for coroutines that are simply resumed in focus of the stack that they treat as their main.

2. Are stack names always statically known in Dawn? Put differently, can you generate fresh stack names at runtime? If so, what does that look like, and how does that impact a theoretical reduction from multi-stack Dawn to single-stack Dawn? If not, can you envision additional intrinsics for supporting coroutines without requiring the user program to virtualize stacks manually?

3. Do you have any thoughts on concurrency in general in Dawn? I'm a big fan of the mental model offered by concurrent logic paradigms (especially Saraswat's concurrent constraint programming), where concurrent systems interact monotonically over shared logical variables. I think there's a subtle and exciting alignment between that concept and your named stacks. Does this line up with any of your thoughts?

4. (A bonus) What happens when an operation is attempted on an empty stack? I assume some form of exceptional case occurs, but in the face of concurrency, would it make sense to yield execution until something else puts an element onto the stack? (Probably not, because of LIFO semantics, but it's a fun question!)


1. There's really no main stack, there's just the current stack and the enclosing stack (and the stack enclosing that, etc). It seems to work well in practice, but there might be some subtleties that I've missed, which is a major part of why I'm backing up to formalize it in the next post.

2. Yes, they're statically known.

3. It should ultimately be possible to compile Dawn to concurrent nets and therefore to evaluate concurrent sections concurrently, but I've spent very little time thinking about that. I definitely want to support traditional explicit multithreading and something like async/await M:N threading, but the details are unclear. It should be no less ergonomic or safe than Rust.

4. In Dawn it will be a type error. In the untyped calculus it's a runtime error.


> There's really no main stack

Excellent!

> It should ultimately be possible to compile Dawn to concurrent nets and therefore to evaluate concurrent sections concurrently

I'm thinking more along the lines of Multicore OCaml's approach to concurrency, which uses effect handlers to support userland scheduling without requiring explicit support from the OS. In particular, I'm not asking for different parts of a program to execute truly in parallel.

From my standpoint, concurrency has more to do with how you structure your programs -- I think of it as another view on modularity. In particular, to "evaluate concurrent sections concurrently" is something a user program would do, not something the compiler has to implement in terms of lower-level primitives.
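
To illustrate what I mean by userland scheduling, here is a toy Python sketch (generators standing in for effect handlers; this is my illustration, not a proposal for Dawn):

    from collections import deque

    def scheduler(tasks):
        # Cooperative round-robin scheduler: no OS threads, no parallelism.
        queue = deque(tasks)
        while queue:
            task = queue.popleft()
            try:
                next(task)          # run the task to its next yield point
                queue.append(task)  # still alive, so requeue it
            except StopIteration:
                pass                # the task finished

    def worker(name, steps):
        for i in range(steps):
            print(f'{name}: step {i}')
            yield  # cede control back to the scheduler

    scheduler([worker('a', 2), worker('b', 3)])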


Yes, that is definitely a possibility I hope to explore. Linearity has some interesting consequences for coroutines and async/await-style code that I don't fully understand.


> It should ultimately be possible to compile Dawn to concurrent nets and therefore to evaluate concurrent sections concurrently

Correction: to interaction nets: https://en.wikipedia.org/wiki/Interaction_nets


This is just to note that the [clone apply] clone apply for recursion is very close to the way the lambda calculus Y combinator reduces via graph rewrites, after it is translated into a graph [1]. The graphical reductions lead to just a pair of trivalent nodes, one of them a sort of fanout (clone) and the other an application [2]. There is no lambda term which can be directly translated into this graph, because of the graph's wiring.

This coincidence is curious.

[1] you can check this at https://mbuliga.github.io/quinegraphs/lambda2mol.html#y_comb...

where you can see how the term Y (\x.x) reduces. In the λ>mol textarea, delete what is written there and copy/paste

(\g.((\x.(g (x x))) (\x.(g (x x))))) input

then click the λ>mol button. Then the start button. Stop whenever you want.

[2] The graph akin to a clone-apply pair can be seen at https://mbuliga.github.io/quinegraphs/quinelab.html#howtolab

where you copy/paste this graph as written in the mol notation:

A input y x^FOE x output y

"A" is the apply node and "FOE" is a clone node (say).


I like your project. I have been a fan of both Haskell and Factor, and I'm happy to see somebody trying to unify their qualities. I like Haskell's type system and robust fundamentals, but I also like the effectively syntax-less nature of Factor, and the evaluation order being the same as the read order.

I have a question: Why not base this project on Factor, which has a pretty decent existing implementation and compiler? (But few people use it...)

I think one could add better typing to Factor by creating a separate module for words (like docs and tests are now) which would describe the types of each word (and also other stipulations). Then these could be type checked separately.

As an aside, I always thought that it would be nice if a programming language could specify "stipulations", kinda like assertions but instead of having them inside word definitions (programs), have them outside (so that you could refer to them). For example, if something is a monad instance, the stipulation could specify that it has to follow monad laws. It would be then possible to automatically verify these stipulations (like QuickCheck does it, for example) or just assume they are true and use them for automated program deriving (or during compilation for optimization). A word annotated as having a certain type is just a special case of stipulation. Factor already has some stipulations like that in the form of additional word annotations (e.g. the word is free of side effects).
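
Something like this hypothetical Python sketch of "stipulations", with naive random testing standing in for QuickCheck (all names here are made up for illustration):

    import random

    STIPULATIONS = {}  # word name -> laws, kept outside the word definitions

    def stipulate(word, law):
        STIPULATIONS.setdefault(word, []).append(law)

    def add(x, y):
        return x + y

    # Stipulate, separately from the definition, that add is associative.
    stipulate('add', lambda x, y, z: add(add(x, y), z) == add(x, add(y, z)))

    def check(word, trials=100):
        # QuickCheck-style: throw random inputs at every registered law.
        for law in STIPULATIONS.get(word, []):
            arity = law.__code__.co_argcount
            for _ in range(trials):
                args = [random.randint(-1000, 1000) for _ in range(arity)]
                assert law(*args), f'{word} violates a stipulation on {args}'

    check('add')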


> Why not base this project on Factor, which has a pretty decent existing implementation and compiler?

I have some ambitious ideas that aren't really compatible with Factor, or with any other existing language.


How is this semantics extended to handle named multi-stacks?

For instance, how is the main example from the introduction [1] interpreted?

    {fn f => {spread $x $y $z} {$z drop}
        {$y clone square pop} {$x square pop} add {$y abs pop} sub
    }
Notably, how should we interpret the second use of $y, which makes sense only because the original y value was duplicated in the first $y block?

[1] https://www.dawn-lang.org/posts/introducing-dawn-(part-1)


Not the author, but TBH I don't see the advantage of his multi-stack proposal compared to local variables as in Factor; I think they are actually equivalent.

If Factor had full stack type inference, you could look between any two words and ask what state the stack is in. Then, if you wanted to split the word definition inside the local binding at that point (assuming this is something useful to do, which I am not convinced of), the automatic algorithm to do so would be: put the required locals onto the stack manually at the end of the last word, and then define the next word as taking that stack as input. As a special case, you could extract any sub-sequence that doesn't really manipulate the stack into a word.

I am also not clear on what the advantage of multistacks is for side effect control. In Factor, there is already an implicit set of globals available at any point, i.e. all the words in the vocabulary. And on top of that, there is already a mechanism for restricting all callees from using certain words (for example, restricting them to a pure subset, or to a subset with certain capabilities); this can also be formally verified. I just think that simply restricting the available vocabulary is a more powerful mechanism for restricting effects.

But I guess I will learn this in the next blog post; looking forward to it!


> I am also not clear what the advantage of multistacks is for side effect control.

If you only ever push "effects" onto a stack, you're just building up a trace. The CALM theorem (Consistency as Logical Monotonicity) tells us that concurrent systems can coordinate as long as published information is increased monotonically (rather than destructively changed); as a limiting case, traces are the trivial representation of a series of operations as a monotonically growing set of information.

The alignment between these domains makes me really excited for the next posts in this series. I'm expecting to see a restriction on "drop" for stacks of this type (as alluded to in this post). And if any reader only ever reads a limited suffix of the stack, the runtime (or compiler!) can eagerly prune unreachable parts of the stack.

Also, this makes me very excited for coroutines in this language. I don't think the "main" stack is specially identified in any way; it's just the stack you were using before focusing on another stack. Coroutines are then managed simply by focusing on their stack before resumption; their main stack is simply the ambient one we've set up for them. (Though you may have to do some amount of stack virtualization to avoid needing to generate new stack names; the multistack reduction I posited in a nearby comment relies incidentally on all stack names being statically known.)
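
To make the trace idea concrete, a toy Python sketch of an append-only effect stack with suffix readers and prefix pruning (my illustration of the CALM intuition, not anything from the Dawn posts):

    class EffectTrace:
        # Append-only: published information only grows, so readers never
        # observe a destructive change.
        def __init__(self):
            self.log, self.base = [], 0  # base = index of first retained entry
            self.cursors = {}            # reader name -> next index to read

        def push(self, effect):
            self.log.append(effect)

        def read_new(self, reader):
            i = self.cursors.get(reader, self.base)
            new = self.log[i - self.base:]
            self.cursors[reader] = self.base + len(self.log)
            return new

        def prune(self):
            # Drop the prefix every known reader has already consumed.
            low = min(self.cursors.values(), default=self.base)
            self.log, self.base = self.log[low - self.base:], low

    t = EffectTrace()
    t.push('open'); t.push('write')
    print(t.read_new('r1'))  # ['open', 'write']
    t.prune()                # both entries are now unreachable and discarded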


> If you only ever push "effects" onto a stack, you're just building up a trace.

That's an interesting concept, but not my plan for capabilities/effects. Instead, I'll be relying on linearity and qualified types.

> Also, this makes me very excited for coroutines in this language.

I am interested to see how they work out, too! I've only done a small amount of brainstorming about how they work. I'll likely need to give it some more thought as I formalize the multistack semantics.


> [...] TBH I don't see the advantage of his multi-stack proposal compared to local variables as in Factor [...]

Local variables effectively turn the language into the lambda calculus. They're equivalent in the sense of any two Turing-complete languages being "equivalent", but there are important distinctions. First, allowing the binding of values to variables breaks automatic enforcement of linearity (and of the other substructural sub-languages when intrinsics are restricted). Second, it breaks the concatenative principle and forces you to worry about variable hoisting and renaming when factoring and refactoring.

> I am also not clear what the advantage of multistacks is for side effect control [...]

It's less about control and more about ergonomics and simplicity. Multistacks will essentially give us a capabilities/effects system for free, once we have qualified types with type inference.


I'm guessing we'll see a reduction much like those for multi-head or multi-tape Turing machines to single-tape, single-head, classical Turing machines.

First, let's suppose the main ("unnamed") virtual stack has index 0, and the others ($x, $y, $z, and so on) have positive indices. Our "physical" stack will contain these virtual stacks as quotations. The main stack is at the top (index zero), and other stacks are deeper in the stack (given by their index).

Each virtual stack is represented by a series of nested quotations, so for example the stack "a b c" would be stored on the physical stack as [[[[nil] a] b] c]. We need to define `nil`; it should be something that turns a virtual stack underflow into a physical stack underflow, so let's pick:

    nil = [swap drop clone apply] clone apply
If we want to push (or pop) an element between a virtual stack and the physical stack, we need to traverse down to it, meaning we need to temporarily store all of the stacks with index less than our target. We can do this by successively quoting and composing the top two elements of the real stack, so that given a stack `a b c d` we can move `a` to the second position (and `b c d` all into the first position) as `a [b [c [d]]]`; a `swap` then makes `a` available for immediate operation, after which we `swap` again before returning to the original configuration of the physical stack.

Now, given a program to run, we can see all of the named stacks it uses. This lets us preallocate all of our virtual stacks onto the physical stack at startup. A virtualized stack operation will pop N elements from up to N virtual stacks onto the physical stack, perform some operation, and then push M elements from the physical stack onto up to M virtual stacks.
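
To make the encoding concrete, here is a small Python model of the nested-quotation representation (lists standing in for quotations; my sketch of the construction above, not any prototype's internals):

    # The virtual stack 'a b c' encoded as nested quotations: [[[[nil] a] b] c].
    NIL = 'nil'

    def empty():
        return [NIL]

    def push(vstack, value):
        return [vstack, value]

    def pop(vstack):
        if vstack == [NIL]:
            raise IndexError('virtual underflow becomes a physical underflow')
        rest, top = vstack
        return rest, top

    s = empty()
    for v in ['a', 'b', 'c']:
        s = push(s, v)
    print(s)        # [[[['nil'], 'a'], 'b'], 'c']
    s, top = pop(s)
    print(top, s)   # c [[['nil'], 'a'], 'b']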


In the existing prototypes, only expression stacks are nested, not value stacks. The value multistack is flat and is modeled by a map from stack name to value stack.

Eventually, when compiling to C (and potentially to machine code), values on the different stacks will effectively be assigned to slots on the C stack, which seems similar to what you're describing.
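
For comparison, the flat model is roughly a dict from name to value stack (a Python paraphrase; the stack names here are made up for illustration):

    # One value stack per name; '' stands for the current/unnamed stack.
    multistack = {'': [], '$x': [], '$y': []}

    def push(name, value):
        multistack[name].append(value)

    def pop(name):
        return multistack[name].pop()

    push('$x', 3); push('$y', 4)
    push('', pop('$x') + pop('$y'))  # an operation drawing from two stacks
    print(multistack)  # {'': [7], '$x': [], '$y': []}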


> The value multistack is flat and is modeled by a map from stack name to value stack.

Makes sense for a real implementation, of course. It's nice to have a theoretical reduction, though, so we know there's no semantic difference between using multiple stacks and using a single stack.


I'm not sure if such a reduction is possible, in general, without explicitly modeling a map from stack names to stacks (which you could certainly do in the UCC, but it would be quite verbose). For specific programs you can always do such a reduction, of course.


Right, for sure. I think Turing reductions tend to be in terms of a concrete Turing machine, anyway; I don't think there's any claim being made here of a universal multistack machine, which is what you might be touching on.


My next post will cover this :-)


Who is your target user?

> to democratize formal software verification in order to make it much more feasible and realistic to write perfect software

To me that sounds like it would appeal only to people who are curious by nature or prone to trying new programming languages, as the language seems like an unfinished typed APL dialect; but perhaps I didn't grasp the idea well enough.

Could you add some practical examples next to the theoretical points at each step?


These toy languages are not practical for general programming. Their purpose is to test and validate the formalized foundations of Dawn.

Once we build up functionality nearing that of Dawn, it should be more feasible to provide practical examples that demonstrate the value of the language.


Every language initially appeals only to people with a propensity to try new programming languages!


Were you considering explicitly splitting the stack and the queue (similarly to http://nsl.com/k/xy/xy.htm)? In the current "flat" notation, the active word to be applied is implicit.


I'm not entirely sure I understand. The formal semantics hopefully speak for themselves. Ultimately, when compiling Dawn to C or machine code, we'll want to use a stack of instruction pointers rather than maintaining an explicit queue of terms to be evaluated.


"In fact, Turing completeness can be achieved with as few as two intrinsics", are you referring to the SK combinators, or in terms of the intrinsics in your language?


Yes, essentially, though things are slightly different for the concatenative calculus. The traditional combinator calculus is a restriction of the lambda calculus; the SK calculus does not operate on a stack.


Dawn looks very different to program in than a traditional language. Will that always be true, or can you (and do you want to!) build layers on top that are more standard-looking?


I am not the author, but in this case the form of the language supports the differentiated function that might make it useful to someone versus some other class of language. Sophisticated concatenative language designs can naturally express certain things that are pretty difficult and awkward to express in more conventional programming language idioms. An example, and the implementation problem that introduced me to concatenative languages, is software that does automated program induction. The downside is that these languages are difficult to optimize for conventional silicon -- you would not want to use them to build software they are not uniquely good at.

While you could build layers that express a more conventional set of language idioms, most of the value in using this language as opposed to some other will be in exploiting the unconventional expressiveness of the mechanics.


I do hope to provide a derived term or syntax for something like a for loop with support for continue and break, since it's hard to match them in terms of productivity and ergonomics. I haven't tried to prototype anything like that, though, so we'll see.

However, I don't think writing Dawn will be as unfamiliar as it might seem, especially to functional programmers.


Were you inspired by "Joy"?


Joy is definitely an influence. In my understanding, it's the first strictly concatenative language, whereas Forth has several non-concatenative terms.


What question would you most like to be asked? heh :sweat-smile:


How old is your cat?


any comments on prime p or brainfuck?



