felixr's favorites | Hacker News

A fun and fairly simple project, with a surprisingly high ratio of usefullness to effort, is to write an interpreter for a concatenative language. Concatenative languages, like FORTH, can do a lot with very limited resources, making them good candidates for embedded systems.

If you want to play around with making your own concatenative language, it is actually surprisingly simple. Here is an overview of a step-by-step approach that can take you from a simple calculator to a full language with some optimization that would actually be quite reasonable to use in an embedded system.

So let's start with the calculator. We are going to have a data stack, and all operations will operate on the stack. We make a "dictionary" whose entries are "words" (basically names of functions). For each word in the dictionary, the dictionary contains a pointer to the function implementing that word.

We'll need six functions for the calculator: add, sub, mul, div, clr, and print. The words for these will be "+", "-", "x", "/", "clr", and "print". So our dictionary looks like this in C:

    struct DictEntry {
        char * word;
        int (*func)(void);
    } dict[6] = {
        {"+", add}, {"-", sub}, {"x", mul},
        {"/", div}, {"clr", clr}, {"print", print}
    };

We need a main loop, which will be something like this (pseudocode):

    while true
        token = NextToken()
        if token is in dictionary
            call function from that dict entry
        else if token is a number
            push that number onto the data stack

Write NextToken, making it read from your terminal and parse into whitespace separated strings, implement add, sub, mul, div, clr, and print, with print printing the top item on the data stack on your terminal, and you've got yourself an RPN calculator. Type "2 3 + 4 5 + x print" and you'll get 45.

OK, that's fine, but we want something we can program. To get to that, we first extend the dictionary a bit. We add a flag to each entry allowing us to mark the entry as either a C code entry or an interpreted entry, and we add a pointer to an array of integers, and we add a count telling the length of that array of integers. When an entry is marked as C code, it means that the function implementing it is written in C, and the "func" field in the dictionary points to the implementing function. When an entry is marked as interpreted, it means that the pointer to an array of integers points to a list of dictionary offsets, and the function is implemented by invoking the functions of the referenced dictionary entries, in order.

A dictionary entry now looks something like this:

    struct DictEntry {
        char * word;
        bool c_flag;
        void (*func)(void);
        int * def;
        int deflen;
    }

(continued in reply)