I feel a rant coming on... This is a _fantastic_ article which should be read by...

pauldirac137 · on Sept 11, 2010

I've re-read the article, and now I think it's even more fantastic than I did at first. There are just so many deep insights in it, I'm in awe. Kudos to the author. I especially like this statement:

"Because you need superhuman abilities to work without layers."

The extreme Forth approach described in the article is all about doing away with all abstraction layers and optimizing everything at all layers simultaneously as much as possible, much like (brilliant analogy from the article) bacteria optimize every base pair in their DNA. But humans _don't_ optimize every base pair in their DNA. The author didn't pursue this, so I will, because it's important: humans don't do this because you couldn't evolve a system anywhere near as complex as a human _without_ having abstraction layers and the (unfortunate) lack of optimality that this entails. The layers mean that the subsystems can evolve in a reasonably independent fashion (like software engineers not needing to change the hardware), but you sacrifice ultimate optimality for that. It's a worthwhile trade-off: you get scalability. With extreme-Forth you don't, which is why Forth programs solve small-scale problems, and Forth programmers simply avoid large-scale problems (or on a bad day, whine about how those large-scale problems shouldn't exist, yadda yadda). Again, I point to my web browser problem; you may think that CSS is completely unnecessary, but it's a layer that makes life easier for people who do not have the time or expertise to optimize software or hardware, so it's part of the problem domain, deal with it! Gimme my abstraction layers!

Seriously, I've seen Chuck Moore quoted as saying that we shouldn't name constants but use raw (magic) numbers instead, because by naming them we throw away valuable information that could be used for optimization. That is serious layer-hatred going on there.

I personally am interested in stack languages for a completely different reason: stacks allow you to compose functions in a way that is much more cumbersome without a stack, and I think that's neat. Not world-shaking, but neat. I like to have the option to use local variables if that's the natural way to decompose the problem (an expression graph, as the article describes) and use the stack if _that's_ the natural way to decompose the problem (expression trees). Why have only one tool in your toolbox when you can have two?

jacquesm · on Sept 11, 2010

Hey Paul, thank you for those two super comments. I have some experience with stack languages as well, wrote my own little forth a long time ago and just posted a comment which (while much shorter than yours) tries to defend forths 'raison d'etre' as being limited to embedded systems, bootstrapping new hardware and pretty much nothing else.

The forth zealots are running out of steam very quickly when the programs and data structures get more complex, I've yet to see a program in forth longer than a few hundred lines (though I'm sure they exist).

If anybody ever wrote a bookkeeping system in Forth then I'm not aware of it. (and if they did it probably deals with whole dollar values only and is limited to a certain turnover).

Once long ago I worked on a project that contained the 'NOVIX', then brand spanking new and top of the line, a high level language directly executed on a chip.

The project started off really well, it contained a PC component and an embedded component, the other programmer was to write some image processing software for the NOVIX chip and it would output a high level representation of the image sensor data through the bus to the 386 which would do the presentation.

In the end I wrote the image processing code as well simply because the problem was too hard to solve in forth in a reasonable time, so we would simply pump raw image data across the bus (fortunately we had enough bandwidth for that).

The time to write the image processing component in C was exactly a single day.

Forth is at the other end of expressiveness compared to most languages, if the language matches your problem well it's amazingly compact and short, but if there is an impedance mismatch between the language and the problem at hand then you may very well end up in trouble you can not get out of. There are only so many DUPs and ROT3s you can see before your eyes glaze over, and even if local variables help they are not going to solve the underlying problem that Forth tends to force the programmer and the problem to adapt to Forth rather than the other way around.

Edit: Another thing I really like about Forth besides it being such a tiny core is the fact that any program in Forth is essentially a DSL.

If someone could remove the roadblocks that stop Forth/Factor projects from going above a certain size then that might ignite a more mainstream interest in to stack based languages.

silentbicycle · on Sept 11, 2010

Forth-like systems do get used for larger projects - we just call them stack-based virtual machines, and compile to their bytecode rather than writing it by hand (and letting the compiler handle the tricky dataflow issues, eliminate superfluous stack manipulations, etc.).

Forth makes sense to me in embedded (or otherwise tightly-constrained) systems, but otherwise, you're spending mental energy on being a human compiler.

I agree that the concatenative languages are interesting, but the APL family is probably a better example. (Joy is also cool, but much less practical.)

jacquesm · on Sept 11, 2010

That's pretty much the way the article ends, so yes, sure. But I meant as in 'written directly in Forth'.

thyrsus · on Sept 12, 2010

The book "Thinking Forth" contains a quote from Charles Moore: "In my accounting system [...]"; the example shows two significant places after the decimal point for the dollar amount - whether the type is floating point or fixed decimal is naturally implicit (and uninferable from the example).

Nothing in that vignette contradicts the observations of the article.

kragen · on Sept 12, 2010

I'm fairly sure it was fixed-point. Floating-point accounting is widely considered a bad idea, even in places that are otherwise positive about floating-point.

cageface · on Sept 11, 2010

There is something very intellectually satisfying about languages that derive all their power from just a handful of small, orthogonal concepts. The elegance of languages like Scheme and Forth is seductive, and it seems reasonable to assume that an elegant language leads to elegant solutions.

This doesn't play out as well in practice though. Most of the time when programming we're working at abstraction levels far, far above those established by the language kernel and it's the tractability of the language at those levels that matters. It's charming that you can implement your own object system in a few lines of Scheme but what's much more valuable in most cases is a single, standardized object system used consistently throughout a codebase and its dependencies. This is why we program today in Java and Python and not Forth.

silentbicycle · on Sept 11, 2010

Ironically, though, Java is often compiled to a stack-based virtual machine. "A number of instructions are provided for the direct manipulation of the operand stack: pop, pop2, dup, dup2, dup_x1, dup2_x1, dup_x2, dup2_x2, swap." (http://java.sun.com/docs/books/jvms/second_edition/html/Over...) CPython uses stack-based bytecode, too (http://docs.python.org/release/2.5.2/lib/bytecodes.html - POP_TOP, ROT_THREE, etc.), and Lua's C API also uses a stack interface (though its VM is register-based now).

Those stack manipulations still creep in, they're just not written by hand anymore.

It makes more sense to me to think of Forth as a kind of virtual machine. Writing its bytecode by hand seems like a bad trade-off, except in an extremely limited environment.

zephyrfalcon · on Sept 11, 2010

Hmm, so, in theory (and maybe in practice), instead of writing a new stack-based virtual machine, one could use Forth instead, and write a compiler (for a given language) that targets it. I wonder if this has actually been done?

silentbicycle · on Sept 11, 2010

VMs usually have a fixed set of instructions, because the compiler needs to know how to cleanly map the language semantics to them. It's just that most stack machine bytecode tends to overlap with a subset of Forth.

cma · on Sept 12, 2010

We did exactly that in our compilers class at school.

_delirium · on Sept 11, 2010

> Here's a challenge: I would really love to see someone write a full-featured web browser (at the level of a Firefox, Chrome or Safari) in Forth. You can do that in C, C++, Java, Python, and probably in dozens of other languages I could name. [...] I think that any web browser that anyone attempted to write in Forth (without local variables, naturally!) would never get finished.

Do those languages actually pass the test, though? I see the point, but I think this particular challenge problem is a bit too hard in terms of size of teams and resources needed--- hard enough that as a descriptive matter, Java, Python, and every other language that isn't C/C++ has actually failed the test. Many people have tried, and all of those browsers have, well, never gotten finished. People have been writing a pure-Java modern web browser since at least 1997, when Netscape decided to rewrite Navigator in Java (in a project, "Javagator", that was never finished). Along the way, there've been a half-dozen other half-finished Java browsers (HotJava, Lobo, etc.). There's a half-finished Common Lisp browser, too (Closure). But nothing outside of C/C++ competes with Webkit, Opera, IE, and Firefox.

I do agree that Forth isn't a great language for writing to complex predefined specifications. I like its general ideas, though, and the principle of deciding to solve a simpler problem rather than a harder one actually seems fine to me, if you really are in a position where you have the choice. Not that I'm big on Forth, personally, but it's somewhat philosophically consonant with my own preferences, which are in the suckless.org vein.

pauldirac137 · on Sept 11, 2010

You're right, I'm being extremely charitable to Java and Python. As you say, there was a Java browser (HotJava) in the early days of browsers, and there was a Python browser once called Grail, but neither one was even close to Firefox/Chrome/Safari. I think you _could_ write a Firefox clone in Java, and in Python too (though a Python Firefox clone would be perhaps too slow to use). My point was that I don't think you could even write such a beast in Forth (or, more accurately, in Forth using the extreme-Forth approach described in the article; maybe it would be possible in ANS Forth). You need the abstraction layers or the problem is unmanageable.

benreesman · on Sept 11, 2010

i know for a fact that CSS 2.1 layout and ECMA-262 (modulo, "not much") scripting (on a stack machine, with an instruction for '==' and a different one for '==='!) can be done with the same algorithms implemented in java and c++. reading the specs is a good way to get an ulcer, though. :)

zephyrfalcon · on Sept 11, 2010

A while ago I stumbled upon a stack-based language called V: http://code.google.com/p/v-language/

On the very first page is V's version of the quadratic equation problem:

  [quad-formula
      [a b c] let
      [minisub 0 b -].
      [radical b b * 4 a * c * - sqrt].
      [divisor 2 a *].
      [root1 minisub radical + divisor /].
      [root2 minisub radical - divisor /].
      root1 root2
  ].

  |2 4 -30 quad-formula ??
  =(-5.0 3.0)

From what I understand, V encourages the use of local variables; it is also more of a high-level language, like Joy, as opposed to being close to the "bare metal" like Forth.

kragen · on Sept 12, 2010

This may be lower-level, with all the explicit type conversions, but it doesn't look much more complicated to me. Maybe I'm fooling myself?

    variable a  variable b  variable c  fvariable discriminant
    : b^2 b @ dup * ;       : 4ac 4 a @ c @ * * ;   : s>f s>d d>f ;
    : -b  b @ negate s>f ;  : /2a  a @ 2* s>f f/  ; : f. f>d d. ;
    : quadeq c ! b ! a !    b^2 4ac - s>f fsqrt  discriminant f!
        -b  discriminant f@ f+  /2a
        -b  discriminant f@ f-  /2a ;

Usage:

    : kragen@inexorable:~/devel/inexorable-misc ; gforth quadeq.fs
    Gforth 0.6.2, Copyright (C) 1995-2003 Free Software Foundation, Inc.
    Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
    Type `bye' to exit
    2 4 -30 quadeq f. f. -5. 3.  ok

I'd dispense with the floating-point nonsense altogether, but ANS Forth, like C, only defines a square-root routine in floating-point. It's a little more code to define in Forth than in C:

    variable n  : improve  n @ over / + 2/ ;  : far?  - abs 1 > ;
    : sqrt  dup n !  dup if  begin  dup improve  swap over far? while  repeat  then
      dup  dup * n @ > if  1-  then  ;

But that hardly seems like a strong recommendation.

Edit: someone else's cute Newton's-method square-root is at http://jasonwoof.org/sqrt.fs --- it looks quite a bit simpler!

kragen · on Sept 13, 2010

On further inspection, the jasonwoof code is quite a bit simpler not only because it has a better approach (which it does!) but also because it leaves out the checks for 0 and off-by-one. So it crashes on 0 and gives the wrong answer for 6, saying 3 when the answer correct to two places is 2.45. You can correct the wrong answers with the same dup dup * n @ > if 1- then business that I was using, assuming you want floor(sqrt(n)) instead of, say, round(sqrt(n)) or ceil(sqrt(n)).

thesz · on Sept 11, 2010

I should note that stack manipulation gets easier when you assign types for values on stack. Combined with algebraic types from functional languages, it allows one to express very interesting programs in quite Forth way without breaking security of strongly typed languages.

For example, it is customary for Forth words to return value under flag: open-file ( String -- Sandle true | false ). Ie, if we then doing IF, we get "handle" on top of stack in THEN part and nothing on ELSE part. We cannot DUP or OVER those values using regular DUP or OVER. But open-file can be typed differently: open-file ( String -- Maybe Handle) and then we should scrutinize value in Maybe type (defined as data Maybe type = Nothing | Just type) using special onmaybe control word, much like if. DUP or OVER can require that their stack picture should contain only single word values and they won't operate with Maybe because Maybe isn't single-word.

I took an attempt to implement that in Haskell: http://thesz.mskhug.ru/svn/misc/TypedStack.hs

kragen · on Sept 12, 2010

Here's my Forth version of the quadratic equation from elsewhere in this thread; it ends up being six lines instead of three, but I don't think that's excessive:

    variable a  variable b  variable c  fvariable discriminant
    : b^2 b @ dup * ;       : 4ac 4 a @ c @ * * ;   : s>f s>d d>f ;
    : -b  b @ negate s>f ;  : /2a  a @ 2* s>f f/  ; : f. f>d d. ;
    : quadeq c ! b ! a !    b^2 4ac - s>f fsqrt  discriminant f!
        -b  discriminant f@ f+  /2a
        -b  discriminant f@ f-  /2a ;

I agree that the Python version is a lot clearer and less bug-prone, but I don't agree that this is extremely difficult to do without local variables. Forth-style global variables work fine, because the function isn't recursive. They're just bug-prone, that's all.

But they're much less bug-prone in Forth than in, say, C. I'm too sleepy to write in any detail about the aspects of Forth that lead me to this conclusion right now; they have to do with what Forth does with multiple definitions with the same name, with wordlists, and with layout. Suffice it to say that I'm no Forth advocate, but I do find it fascinating.

dedward · on Sept 11, 2010

I'd be curious how long it would take you to port the python stack to run that code -vs- how loing it would take for someone to implement FORTH on a brand-new platform with a new instruction set (and remember, if you want to use a C compiler, you'll have to write or port that too.)