Yosefk - My history with Forth & stack machines

impeachgod · on Sept 11, 2010

I've always been fascinated with Chuck Moore's VLSI design tools. How does he manage to design working VLSI chips with such limited functionality? Does he do complex simulation by hand? Wouldn't it get hard laying out each individual transistor?

thesz · on Sept 11, 2010

I think he use combinatoric approach.

Yuo can describe a gate using transistors. Then you can describe a circiuit using gates and so on.

This is wildly popular in functional programming, and in Haskell in particular.

Take a look at those two libraries: Lava: http://hackage.haskell.org/package/chalmers-lava2000 and http://hackage.haskell.org/package/york-lava

and Wired: http://hackage.haskell.org/package/Wired

Wired is especially interesting because it let you simultaneously describe a circuit and its layout.

pauldirac137 · on Sept 11, 2010

I feel a rant coming on...

This is a _fantastic_ article which should be read by anyone interested in stack languages.

Though it wasn't my first programming language, Forth was certainly the first programming language I loved. I read Starting Forth and was blown away by the elegance of the language and by how easily you could get the language to modify itself. It also started a fascination with stack languages that persists to this day.

However, when I tried to actually do anything useful with Forth, I kept banging my head against a wall. The typical problem is that stack juggling is just too hard for most problems, and even when it isn't, it's more work than just writing down an algebraic expression would be. Stacks work well for some problems, but evaluating arbitrary algebraic expressions isn't one. Why would I want to do that? asks the Forth zealot. Because I do! Because I don't want to have to worry that I won't be able to do that easily in the future when I do need to. So even though I've designed my own Forths and other stack languages since then, rule #1 is always: support local variables. Even if I don't need them, I want them. If you don't believe me, translate this problem into Forth or some other stack language in a clear, intuitive way without using local variables:

    # python code to compute solutions of a**2*x + b*x + c == 0:
    def solve_quadratic_equation(a, b, c):
        # assume a >=0, no complex roots
        d = math.sqrt(b * b - 4.0 * a * c)
        return ((-b - d) / (2.0 * a), (-b + d) / (2.0 * a))

Three lines of code, completely transparent. Now see this thread: http://osdir.com/ml/lang.concatenative/2005-01/msg00023.html. You'll see how difficult this is to do without local variables in a stack language. The reason is easy to spot: you are using values in a pretty random-access way, not in a streamed way. This is common in algebraic equations. And it's not stack-friendly.

Another thing I hate about pure Forth is that floating-point values, if they exist at all (they often don't) are put on a different stack, which means you have to duplicate all the operators with different names (I call this "operator underloading") and this can lead to hilariously obscure bugs because the two stacks are independent.

Basically, the attitude that the hard-core Forthers have reminds me of analogous statements that have been made about communism, or moonieism or whatever extreme -ism is promoted as the cure to all problems: if you are having problems, it's because you haven't gone far enough down the road yet. Only by committing 1000% to the approach will you ever get anything done. This is pure cult behavior.

And yet: Forth does have a valid and useful niche, in embedded systems, and it is a fascinating language, one well worth studying. But I don't think it's a coincidence that none of the software that people use everyday (other than very low-level stuff like e.g. the boot proms) on their computers is written in Forth.

Here's a challenge: I would really love to see someone write a full-featured web browser (at the level of a Firefox, Chrome or Safari) in Forth. You can do that in C, C++, Java, Python, and probably in dozens of other languages I could name. The beauty of this problem is that you can't go and bitch that you shouldn't have to support something like (say) CSS or embedded video because you think it's unnecessary; it's part of the problem domain and you don't get to complain about it. I think that any web browser that anyone attempted to write in Forth (without local variables, naturally!) would never get finished. But maybe I'm wrong, and I just haven't reached Forthlightenment yet. If so, prove me wrong.

pauldirac137 · on Sept 11, 2010

I've re-read the article, and now I think it's even more fantastic than I did at first. There are just so many deep insights in it, I'm in awe. Kudos to the author. I especially like this statement:

"Because you need superhuman abilities to work without layers."

The extreme Forth approach described in the article is all about doing away with all abstraction layers and optimizing everything at all layers simultaneously as much as possible, much like (brilliant analogy from the article) bacteria optimize every base pair in their DNA. But humans _don't_ optimize every base pair in their DNA. The author didn't pursue this, so I will, because it's important: humans don't do this because you couldn't evolve a system anywhere near as complex as a human _without_ having abstraction layers and the (unfortunate) lack of optimality that this entails. The layers mean that the subsystems can evolve in a reasonably independent fashion (like software engineers not needing to change the hardware), but you sacrifice ultimate optimality for that. It's a worthwhile trade-off: you get scalability. With extreme-Forth you don't, which is why Forth programs solve small-scale problems, and Forth programmers simply avoid large-scale problems (or on a bad day, whine about how those large-scale problems shouldn't exist, yadda yadda). Again, I point to my web browser problem; you may think that CSS is completely unnecessary, but it's a layer that makes life easier for people who do not have the time or expertise to optimize software or hardware, so it's part of the problem domain, deal with it! Gimme my abstraction layers!

Seriously, I've seen Chuck Moore quoted as saying that we shouldn't name constants but use raw (magic) numbers instead, because by naming them we throw away valuable information that could be used for optimization. That is serious layer-hatred going on there.

I personally am interested in stack languages for a completely different reason: stacks allow you to compose functions in a way that is much more cumbersome without a stack, and I think that's neat. Not world-shaking, but neat. I like to have the option to use local variables if that's the natural way to decompose the problem (an expression graph, as the article describes) and use the stack if _that's_ the natural way to decompose the problem (expression trees). Why have only one tool in your toolbox when you can have two?

jacquesm · on Sept 11, 2010

Hey Paul, thank you for those two super comments. I have some experience with stack languages as well, wrote my own little forth a long time ago and just posted a comment which (while much shorter than yours) tries to defend forths 'raison d'etre' as being limited to embedded systems, bootstrapping new hardware and pretty much nothing else.

The forth zealots are running out of steam very quickly when the programs and data structures get more complex, I've yet to see a program in forth longer than a few hundred lines (though I'm sure they exist).

If anybody ever wrote a bookkeeping system in Forth then I'm not aware of it. (and if they did it probably deals with whole dollar values only and is limited to a certain turnover).

Once long ago I worked on a project that contained the 'NOVIX', then brand spanking new and top of the line, a high level language directly executed on a chip.

The project started off really well, it contained a PC component and an embedded component, the other programmer was to write some image processing software for the NOVIX chip and it would output a high level representation of the image sensor data through the bus to the 386 which would do the presentation.

In the end I wrote the image processing code as well simply because the problem was too hard to solve in forth in a reasonable time, so we would simply pump raw image data across the bus (fortunately we had enough bandwidth for that).

The time to write the image processing component in C was exactly a single day.

Forth is at the other end of expressiveness compared to most languages, if the language matches your problem well it's amazingly compact and short, but if there is an impedance mismatch between the language and the problem at hand then you may very well end up in trouble you can not get out of. There are only so many DUPs and ROT3s you can see before your eyes glaze over, and even if local variables help they are not going to solve the underlying problem that Forth tends to force the programmer and the problem to adapt to Forth rather than the other way around.

Edit: Another thing I really like about Forth besides it being such a tiny core is the fact that any program in Forth is essentially a DSL.

If someone could remove the roadblocks that stop Forth/Factor projects from going above a certain size then that might ignite a more mainstream interest in to stack based languages.

silentbicycle · on Sept 11, 2010

Forth-like systems do get used for larger projects - we just call them stack-based virtual machines, and compile to their bytecode rather than writing it by hand (and letting the compiler handle the tricky dataflow issues, eliminate superfluous stack manipulations, etc.).

Forth makes sense to me in embedded (or otherwise tightly-constrained) systems, but otherwise, you're spending mental energy on being a human compiler.

I agree that the concatenative languages are interesting, but the APL family is probably a better example. (Joy is also cool, but much less practical.)

jacquesm · on Sept 11, 2010

That's pretty much the way the article ends, so yes, sure. But I meant as in 'written directly in Forth'.

thyrsus · on Sept 12, 2010

The book "Thinking Forth" contains a quote from Charles Moore: "In my accounting system [...]"; the example shows two significant places after the decimal point for the dollar amount - whether the type is floating point or fixed decimal is naturally implicit (and uninferable from the example).

Nothing in that vignette contradicts the observations of the article.

kragen · on Sept 12, 2010

I'm fairly sure it was fixed-point. Floating-point accounting is widely considered a bad idea, even in places that are otherwise positive about floating-point.

cageface · on Sept 11, 2010

There is something very intellectually satisfying about languages that derive all their power from just a handful of small, orthogonal concepts. The elegance of languages like Scheme and Forth is seductive, and it seems reasonable to assume that an elegant language leads to elegant solutions.

This doesn't play out as well in practice though. Most of the time when programming we're working at abstraction levels far, far above those established by the language kernel and it's the tractability of the language at those levels that matters. It's charming that you can implement your own object system in a few lines of Scheme but what's much more valuable in most cases is a single, standardized object system used consistently throughout a codebase and its dependencies. This is why we program today in Java and Python and not Forth.

silentbicycle · on Sept 11, 2010

Ironically, though, Java is often compiled to a stack-based virtual machine. "A number of instructions are provided for the direct manipulation of the operand stack: pop, pop2, dup, dup2, dup_x1, dup2_x1, dup_x2, dup2_x2, swap." (http://java.sun.com/docs/books/jvms/second_edition/html/Over...) CPython uses stack-based bytecode, too (http://docs.python.org/release/2.5.2/lib/bytecodes.html - POP_TOP, ROT_THREE, etc.), and Lua's C API also uses a stack interface (though its VM is register-based now).

Those stack manipulations still creep in, they're just not written by hand anymore.

It makes more sense to me to think of Forth as a kind of virtual machine. Writing its bytecode by hand seems like a bad trade-off, except in an extremely limited environment.

zephyrfalcon · on Sept 11, 2010

Hmm, so, in theory (and maybe in practice), instead of writing a new stack-based virtual machine, one could use Forth instead, and write a compiler (for a given language) that targets it. I wonder if this has actually been done?

silentbicycle · on Sept 11, 2010

VMs usually have a fixed set of instructions, because the compiler needs to know how to cleanly map the language semantics to them. It's just that most stack machine bytecode tends to overlap with a subset of Forth.

cma · on Sept 12, 2010

We did exactly that in our compilers class at school.

_delirium · on Sept 11, 2010

> Here's a challenge: I would really love to see someone write a full-featured web browser (at the level of a Firefox, Chrome or Safari) in Forth. You can do that in C, C++, Java, Python, and probably in dozens of other languages I could name. [...] I think that any web browser that anyone attempted to write in Forth (without local variables, naturally!) would never get finished.

Do those languages actually pass the test, though? I see the point, but I think this particular challenge problem is a bit too hard in terms of size of teams and resources needed--- hard enough that as a descriptive matter, Java, Python, and every other language that isn't C/C++ has actually failed the test. Many people have tried, and all of those browsers have, well, never gotten finished. People have been writing a pure-Java modern web browser since at least 1997, when Netscape decided to rewrite Navigator in Java (in a project, "Javagator", that was never finished). Along the way, there've been a half-dozen other half-finished Java browsers (HotJava, Lobo, etc.). There's a half-finished Common Lisp browser, too (Closure). But nothing outside of C/C++ competes with Webkit, Opera, IE, and Firefox.

I do agree that Forth isn't a great language for writing to complex predefined specifications. I like its general ideas, though, and the principle of deciding to solve a simpler problem rather than a harder one actually seems fine to me, if you really are in a position where you have the choice. Not that I'm big on Forth, personally, but it's somewhat philosophically consonant with my own preferences, which are in the suckless.org vein.

pauldirac137 · on Sept 11, 2010

You're right, I'm being extremely charitable to Java and Python. As you say, there was a Java browser (HotJava) in the early days of browsers, and there was a Python browser once called Grail, but neither one was even close to Firefox/Chrome/Safari. I think you _could_ write a Firefox clone in Java, and in Python too (though a Python Firefox clone would be perhaps too slow to use). My point was that I don't think you could even write such a beast in Forth (or, more accurately, in Forth using the extreme-Forth approach described in the article; maybe it would be possible in ANS Forth). You need the abstraction layers or the problem is unmanageable.

benreesman · on Sept 11, 2010

i know for a fact that CSS 2.1 layout and ECMA-262 (modulo, "not much") scripting (on a stack machine, with an instruction for '==' and a different one for '==='!) can be done with the same algorithms implemented in java and c++. reading the specs is a good way to get an ulcer, though. :)

zephyrfalcon · on Sept 11, 2010

A while ago I stumbled upon a stack-based language called V: http://code.google.com/p/v-language/

On the very first page is V's version of the quadratic equation problem:

  [quad-formula
      [a b c] let
      [minisub 0 b -].
      [radical b b * 4 a * c * - sqrt].
      [divisor 2 a *].
      [root1 minisub radical + divisor /].
      [root2 minisub radical - divisor /].
      root1 root2
  ].

  |2 4 -30 quad-formula ??
  =(-5.0 3.0)

From what I understand, V encourages the use of local variables; it is also more of a high-level language, like Joy, as opposed to being close to the "bare metal" like Forth.

kragen · on Sept 12, 2010

This may be lower-level, with all the explicit type conversions, but it doesn't look much more complicated to me. Maybe I'm fooling myself?

    variable a  variable b  variable c  fvariable discriminant
    : b^2 b @ dup * ;       : 4ac 4 a @ c @ * * ;   : s>f s>d d>f ;
    : -b  b @ negate s>f ;  : /2a  a @ 2* s>f f/  ; : f. f>d d. ;
    : quadeq c ! b ! a !    b^2 4ac - s>f fsqrt  discriminant f!
        -b  discriminant f@ f+  /2a
        -b  discriminant f@ f-  /2a ;

Usage:

    : kragen@inexorable:~/devel/inexorable-misc ; gforth quadeq.fs
    Gforth 0.6.2, Copyright (C) 1995-2003 Free Software Foundation, Inc.
    Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
    Type `bye' to exit
    2 4 -30 quadeq f. f. -5. 3.  ok

I'd dispense with the floating-point nonsense altogether, but ANS Forth, like C, only defines a square-root routine in floating-point. It's a little more code to define in Forth than in C:

    variable n  : improve  n @ over / + 2/ ;  : far?  - abs 1 > ;
    : sqrt  dup n !  dup if  begin  dup improve  swap over far? while  repeat  then
      dup  dup * n @ > if  1-  then  ;

But that hardly seems like a strong recommendation.

Edit: someone else's cute Newton's-method square-root is at http://jasonwoof.org/sqrt.fs --- it looks quite a bit simpler!

kragen · on Sept 13, 2010

On further inspection, the jasonwoof code is quite a bit simpler not only because it has a better approach (which it does!) but also because it leaves out the checks for 0 and off-by-one. So it crashes on 0 and gives the wrong answer for 6, saying 3 when the answer correct to two places is 2.45. You can correct the wrong answers with the same dup dup * n @ > if 1- then business that I was using, assuming you want floor(sqrt(n)) instead of, say, round(sqrt(n)) or ceil(sqrt(n)).

thesz · on Sept 11, 2010

I should note that stack manipulation gets easier when you assign types for values on stack. Combined with algebraic types from functional languages, it allows one to express very interesting programs in quite Forth way without breaking security of strongly typed languages.

For example, it is customary for Forth words to return value under flag: open-file ( String -- Sandle true | false ). Ie, if we then doing IF, we get "handle" on top of stack in THEN part and nothing on ELSE part. We cannot DUP or OVER those values using regular DUP or OVER. But open-file can be typed differently: open-file ( String -- Maybe Handle) and then we should scrutinize value in Maybe type (defined as data Maybe type = Nothing | Just type) using special onmaybe control word, much like if. DUP or OVER can require that their stack picture should contain only single word values and they won't operate with Maybe because Maybe isn't single-word.

I took an attempt to implement that in Haskell: http://thesz.mskhug.ru/svn/misc/TypedStack.hs

kragen · on Sept 12, 2010

Here's my Forth version of the quadratic equation from elsewhere in this thread; it ends up being six lines instead of three, but I don't think that's excessive:

    variable a  variable b  variable c  fvariable discriminant
    : b^2 b @ dup * ;       : 4ac 4 a @ c @ * * ;   : s>f s>d d>f ;
    : -b  b @ negate s>f ;  : /2a  a @ 2* s>f f/  ; : f. f>d d. ;
    : quadeq c ! b ! a !    b^2 4ac - s>f fsqrt  discriminant f!
        -b  discriminant f@ f+  /2a
        -b  discriminant f@ f-  /2a ;

I agree that the Python version is a lot clearer and less bug-prone, but I don't agree that this is extremely difficult to do without local variables. Forth-style global variables work fine, because the function isn't recursive. They're just bug-prone, that's all.

But they're much less bug-prone in Forth than in, say, C. I'm too sleepy to write in any detail about the aspects of Forth that lead me to this conclusion right now; they have to do with what Forth does with multiple definitions with the same name, with wordlists, and with layout. Suffice it to say that I'm no Forth advocate, but I do find it fascinating.

dedward · on Sept 11, 2010

I'd be curious how long it would take you to port the python stack to run that code -vs- how loing it would take for someone to implement FORTH on a brand-new platform with a new instruction set (and remember, if you want to use a C compiler, you'll have to write or port that too.)

zeteo · on Sept 10, 2010

"Astronomically costly licenses. Geological run times."

Nice writing.

solutionyogi · on Sept 10, 2010

You should definitely read his other articles. He is a talented writer who knows his subject matter as well. I will never forget this paragraph,

"Me. I’m a professional programmer. By which I mean to say, I shovel through piles of virtual shit for a living. And to shovel quickly and happily, I need a big shovel. Python is one of my shovels. Core dumps are one of my piles of shit. Python, meet the core dumps. Core dumps, meet the Python."

"A person who was exposed to machines and doesn’t hate them is either an idiot or is completely devoid of soul! Step back, the child of Satan!"

From: http://www.yosefk.com/blog/python-teaching-kids-and-biting-b...

[I sometimes refer myself as Janitor because I really do clean virtual shit. I wish I had registered http://janitorprogrammer.com/ ]

_delirium · on Sept 11, 2010

His amusingly written and insightful take on redundancy versus dependencies is one of my favorites, a good take imo on a fairly key issue in programming: http://www.yosefk.com/blog/redundancy-vs-dependencies-which-...

"But you already got it - I don’t want your code, because I’m an antisocial asshole that has no team spirit whatsoever. I’m going to parse arguments using 5 lines of C code. Worse, I’ll make a function out of those five lines, prefix its name with the module name, and replicate it in all my modules."

"I don’t want to depend on anything without an owner. [...] not only do I want to depend on stuff with an owner, but I require a happy owner at that. Contrary to a common managerial assumption (one of those which rarely hold but do keep managers sane), I don’t believe in forcibly assigning ownership. If the owner doesn’t like the module, expect some pretty lousy gardening job."

plinkplonk · on Sept 11, 2010

"You should definitely read his other articles. He is a talented writer who knows his subject matter as well."

His take on Extreme Programming is interesting.

http://www.yosefk.com/blog/extreme-programming-explained.htm...

primodemus · on Sept 11, 2010

"Code is more like poetry: change this line, and now the next line doesn’t rhyme, or you’ve broken the rhythm, or you’ve put angry words into a happy poem, that sort of trouble. Which is one reason to like code ownership." Amazing.

ams6110 · on Sept 12, 2010

Wow. Never seen this blog before. I've spent some time reading his blog today, and nearly everything he writes resonates with me. Thanks for the suggestion.

jacquesm · on Sept 10, 2010

Another bit: "This nip/tuck business in the code? Rather than a reference to the drama series on plastic surgery, these are names of Forth stack manipulation words."

jacquesm · on Sept 10, 2010

If you liked this look in to 'factor'.

thristian · on Sept 11, 2010

Yeah, I was (metaphorically) waving my hand in the air and yelling "Factor! Factor!" for the first half of the essay, but when I got to the end I wasn't so sure. Factor solves a lot of the problems the author describes - it has an extensive standard library, and garbage collection, and all kinds of other useful things - but I think the core complaint about writing Forth code still stands: You still have a data stack, stack-shuffling words like 'dup' and 'over' and 'rot' still make for ugly, hard-to-read code, and re-thinking your expression-graph to be expressible without such shuffling is still very hard.

I still have a local checkout of Factor in my home directory, and I really would like to get around to playing with it some more, someday - but (at least to begin with) playing with Factor sometimes feels very much like hard work.

gecko · on Sept 11, 2010

I don't honestly think that Factor suffers from the same problems enumerated here at the end of the day. I used to dally in Forth in the form of Mops and PowerMops, and my problems were basically the same as those enumerated in this essay. I've tried to get into Factor multiple times; this past time I succeeded, and I have been writing increasingly large amounts of code in my spare time that seem to flow okay. I might even start releasing some of it soon. I think the differences can be chalked up to a few major things:

Real locals. Factor locals are not penalized; they perform the same as a data stack. And while another comment correctly notes that Factor programmers prefer to avoid locals, I'd point out that locals are also used in many places in the standard library. Factor's preference is not the same as Forth's near-insistence.

Higher-level combinators via lambdas. Part of why they have that attitude is that Factor's higher-level combinators are very natural if you're coming from a functional language. A lot of its combinators are things like "run these pile of lambdas against this one object" (kind of a reverse map) or "append all of these lambdas with this extra operation or datum" (think of currying). These are VERY different in practice than dup/swap/rot. I still have to pause and think a lot while I'm coding about how and which to use, but it's getting better. The main hurdle is simply not forgetting anything.

Rich data types. Being able to have a single element on the stack that is an array, or a class, or an expandable vector, or a hashtable, or what have you, GREATLY simplifies things compared to Forth, where I'd have to "just know" that the two top things on the stack entering my function were a pointer to an array and its length or some such.

Much better error reporting. With richer data types comes much better error messages. You didn't just read random crap from memory because you read the stack effect in the wrong order; you tried to call + on a hashtable. Combined with static stack effect checking, and I find that debugging my Factor code is usually about logic, whereas at least half the time, my Forth code was about getting the stack effects right.

----

I'm not saying Factor's perfect. It's not. I use more locals than the core team does in my code, and I'm not currently convinced that's wrong in any sense. But I also get a lot of mileage out of the higher-level combinators, to the point that I can write concatenative code without feeling like I did in Forth that I'm doing the processor's work for it. It feels a lot closer to any high-level functional language, where I'm just composing functions, not jiggling the stack.

Avshalom · on Sept 11, 2010

I use more locals than the core team does in my code, and I'm not currently convinced that's wrong in any sense.

I think that the big "win" in Factor vs say CL is implicit argument passing.

from the article even:

In order to have really small definitions, you do need a stack, I guess - or some other implicit way of passing parameters around;

As far as I'm concerned as long as locals don't get in the way of that they're groovy.

silentbicycle · on Sept 11, 2010

You can get the same implicit argument passing in J (called "tacit programming") and Haskell & ML (via currying, called "points-free style"). It's optional in those languages, though, which strikes me as a good idea: sometimes it's a vast improvement, but sometimes it makes the code hard to follow for no real gain.

jacquesm · on Sept 11, 2010

What's the largest program you have written in Factor to date and what does it do ?

gecko · on Sept 11, 2010

The largest thing I wrote was a clone of the parts of "curl" that I use, modifying it slightly so that I can just write JSON to describe what I want to post to a website, rather than the URL-encoded string. This included supporting DELETE, GET, POST, and PUT, and allowing downloading the response only, headers only, or both. I want to add a progress bar and a (admittedly totally unnecessary) GUI, then I'll throw it on GitHub and get feedback on my coding style. I'm enjoying seeing how far I can figure things out without outside help right now, though.

The second largest thing I've written, which I did in a much earlier version of Factor, was to clone enough of Fog Creek Copilot's Reflector that I could use that for testing/developing the Mac versions of the product. That was buggy, disgusting code with piles of stack manipulation, though, and is part of why I quit working in Factor at the time. I wish I still had the code; it'd be interesting to know whether my change of opinion is due to more experience with stack languages, or changes in the language. Probably some of both.

jacquesm · on Sept 11, 2010

That's pretty cool.

I've gotten interested in stack machines again because of 'brick computing' (re-usable components with a tiny VM in them) and 'fabric computing' (networks of such components, either identical pieces or a variety of them).

It's quite amazing to see how little code you need to get a Forth like environment going on a piece of hardware.

pauldirac137 · on Sept 11, 2010

Factor has local variables (implemented as a library) so this problem doesn't really exist. However, Factor programmers try to avoid using locals as much as possible, so in reality the problem does exist. It's almost like some people have a religious aversion to locals. Locals do make factoring harder, and perhaps Factor code can be optimized better without locals, but I think that not having bunches of swaps, dups, rots, nips, tucks (or some of the even more complex Factor stack shuffling words -- you have no idea!) makes locals look very attractive in comparison.

thristian · on Sept 11, 2010

Well, the original article discusses how Forth's local variables don't really help; I assume the same applies to Factor's local-variable implementation.

Factor's higher-level, functional stack-shuffling words (cleaves and splats and what not) are definitely easier to understand than the low-level Forth shuffling words, but I find myself thinking very hard about how values need to be ordered on the stack.

jacquesm · on Sept 11, 2010

Forth is for people making little jewels that stand all by themselves, usually (as in the example) to get something (anything) high level running on new hardware.

That's a pretty unique niche. And for that niche it is virtually without competition.

But if you're going to write an accounting system in forth your a masochist, it's not meant for that.

I have no idea really what factors natural habitat is, but possibly over time it will come to supplant forth in this role.

Forth, factor and lisp (or scheme) are amongst the languages that are easiest to bootstrap, that alone gives them a right to exist.

wwortiz · on Sept 10, 2010

Factor really needs a book or more beginner documentation, that is in my opinion the main thing really holding it back as the visual repl and help browser are quite nice among other things.

kragen · on Sept 12, 2010

Here's my own rant. (I think Paul Dirac 137's rant actually contains more words than P. A. M. Dirac's textbook about quantum mechanics, which I spent many evenings in high school trying unsuccessfully to get to the third page of.)

I remember when I thought Forth was the next big language, too. And like you, my experience trying to program in it was disillusioning.

A big part of the problem, I think, is that I tried to use the stack instead of variables. Forth has always had variables (at one point in your post, Yossi, you say it doesn't; but I think you mean local variables.) It doesn't have to be any harder than C to write. You can translate directly from C to Forth; C:

    static int mac80211_hwsim_start(struct ieee80211_hw *hw)
    {
            struct mac80211_hwsim_data *data = hw->priv;
            printk(KERN_DEBUG "%s:%s\n", wiphy_name(hw->wiphy), __func__);
            data->started = 1;
            return 0;
    }

Forth, just a straight translation of the C:

    variable hw  variable data
    : mac80211_hwsim_start  hw !
      hw @ priv @ data !
      KERN_DEBUG s" %s:%s"  hw @ wiphy @ wiphy_name  printk
      1 data @ started !
      0 ;

This function doesn't happen to be recursive. If it did happen to recurse, we'd have to explicitly save the values of its "local" variables on the stack before the call and restore them afterwards. On the other hand, most functions aren't recursive.

Now, maybe you can optimize that Forth a little bit; for example, you can probably dispense with the variable "data" and just use the top-of-stack, and actually that variable is only read once and surely the debug message line doesn't modify hw->priv, etc. etc. Maybe I should have randomly picked a different piece of C. But at some point along the path of "optimizing" and "simplifying" the Forth code, you find yourself getting into the kind of nightmarish code you demonstrated above, where you have four or five things on the stack and they move all the time, and understanding it is just a nightmare. But you don't have to do it that way. You can use variables, just like in C, and the pain goes away. You never have to use any stack-manipulation operations, not even once. They're always there, tempting you to use them, taunting you; and you have to exercise a great deal of restraint to avoid writing your code as cleverly as possible, because that will make it impossible to debug.

I think in this case the painless-but-optimized version looks like this:

    : mac80211_hwsim_start
      dup wiphy @ wiphy_name  log   priv @ start ;

Here I figure that "log" is locally defined as something that printks a KERN_DEBUG message with the calling function name followed by a colon and then the argument, and that : start started 1 swap ! ;.

But when I said "it doesn't have to be any harder than C to write", I lied a little bit. It doesn't have to require detectably more code, but per line, Forth is still a bit more error-prone than C, in my experience. In C you get enough static type checking to immediately detect something like 90% of your type errors; you get a warning if you forget to pass an argument to a function, or pass too many; the data flow of any subroutine is easily approximable without having to know the stack effect of every subroutine you call; and so on. But maybe if I programmed enough in Forth, these errors would become less significant.

(I suspect that global variables are less of a problem in Forth than in other languages: you can have multiple global variables with the same name without accidental sharing; you're very unlikely to have mutually recursive functions without knowing it (and, in any language, recursive or mutually recursive functions require special care to ensure termination anyway (that is, recursion is error-prone); etc.)

On the other hand, as you pointed out, Forth is easily extensible. I think, as Eric Normand pointed out, that you want to use that extensibility to bootstrap into a less error-prone, more problem-level language/system as quickly as possible, and in fact, this is the standard approach using Forth promoted by the likes of Chuck Moore and Elizabeth Rather, as I understand it. It's just that the next layer up they're talking about is a more traditional domain-specific language, something like FoxPro or Rexx or sh, rather than something with garbage collection and data types.

In theory, at least, it seems like as you scale to a larger program, Forth's advantages over C would become more significant, as your program looks more like a tower of DSLs --- up to the point where you split your C program into multiple programs that the OS protects from each other.

There's a quote from Jeff Fox in my quotes file:

    [In C under Unix] Bugs are planned, and the whole picture is all about
    the planning for bugs.

    Forth is about planning for good code where the bugs don't happen. If
    you say BEGIN AGAIN damn it, you mean BEGIN AGAIN not twenty other
    possible meanings based on C insisting that it is one of twenty
    different bugs that need extra hardware and software to be handled
    properly. 

	-- Jeff Fox <fox@ultratechnology.com>, in a discussion on
	   comp.lang.forth, inadvertently explaining why Forth is not
	   widely used, 2006-05-20, in message-id
	   <1148149942.763594.292230@u72g2000cwu.googlegroups.com>,
	   subject "Re: hardware errors, do C and Forth need different
	   things in hardware?"

My own take on Forth is that it's by far the simplest way to build a macro assembler, and you can do anything with it that you can do with any other macro assembler, perhaps a little bit more easily and with more portability, and syntax that's not quite as nice. At some point you want a high-level language on top of your macro assembler, though. The Forth theory is that implementing a domain-specific high-level language has a better cost/benefit ratio than implementing a general-purpose high-level language, and a macro assembler is a perfectly adequate way of implementing a domain-specific high-level language.

Where this approach falls down is that it's true that implementing a domain-specific high-level language has a better cost/benefit ratio than implementing a general-purpose high-level language if you're implementing it for a single user, such as NRAO. But if your program memory isn't limited, and you can share the language with all the computer users in the world, a general-purpose high-level language like Lua or Python or Tcl or Ruby is a more economically efficient tradeoff, because whenever you implement some optimization, they all get the benefit.

(Also, I actually think it's easier for me to write bug-free assembly than bug-free Forth, but that may be a matter of experience.)

With regard to 18-bit words, I guess the advantage over 16-bit words is that you can fit four instructions per word instead of three. (The low-order bits of the last instruction are necessarily 0; fortunately that is true of NOP.)

Once I asked a famous CPU designer (who shall remain anonymous, since this wasn't a public conversation) what he thought about Chuck Moore. He said something to the effect of, "Chuck Moore? I used to work under him at AMD. He's great!"

"No," I said. "The Forth guy."

"Oh, HIM!" he said. "In my DREAMS I could do what he does."

I think the GreenArrays people are making a big mistake by marketing their chip as competition for microprocessors and microcontrollers. What they're really competing with is FPGAs and PALs. Unfortunately, they don't have a VHDL or Verilog synthesis system for their chips, and I don't think they're going to build one.

I think the reason that almost every bytecode system under the sun is stack-based is that low-level bytecode is basically a bad idea these days, but it made a lot of sense in days when code density was really important. (Bytecode as a compact representation of an AST might still make sense.)

And stack-based bytecode is a lot denser than purely register-based bytecode. Hybrid bytecodes like Smalltalk-80’s, where you have 8 or 16 or 32 bytecodes that fetch and store to registers, are even denser.

I can't claim to be a good or even an adequate Forth programmer. So take all of this with a grain of salt.