Unix history and the `dc` calculator (howdytx.technology)
71 points by dcminter 5 months ago | hide | past | favorite | 57 comments



As the author of a `dc` (https://git.gavinhoward.com/gavin/bc), I appreciate this article!

The author is correct on the parsing of "normal" vs. Reverse Polish notation. My `bc`, with normal expressions, needs 1555 LoC for parsing. My `dc` needs just 298.

I have a couple posts ([1], [2]) about `dc` tricks and code on my blog.

And yes, I have an HP 48G [3]. :)

[1]: https://gavinhoward.com/2019/12/bc/dc-tips-and-tricks/

[2]: https://gavinhoward.com/2021/04/a-dc-script-for-easter/

[3]: https://gavinhoward.com/uploads/easter-dc-script/hp-48g.jpg


is your bc the bc or is it a bc? bc is a POSIX standard so "a" bc would be as good as "the" bc, i'm just curious. "man bc" on Fedora linux says the author is Phil Nelson


Depends on what platform you are using. Most Linux distros? Only a bc. FreeBSD and macOS? The bc. [1]

[1]: https://gavinhoward.com/2023/02/my-code-conquered-another-os...


iHP48 on iOS is excellent. It’s mostly my go-to calculator on the go. I find that the lack of a backlight on the legacy HPs makes them difficult to use (plus I always have my phone).


After learning Forth years ago I became addicted to RPN and stack-based calculation, to the point where I wrote myself a calculator for Android that does just what I need it to (no existing calculators satisfied my Forth-like craving).

The really nice thing about stack based calculators that keeps me coming back to mine is that I can get all the numbers I'm working with down in one shot and then figure out which computations I'm applying in which order as a second step.

Until I tried RPN I never noticed how much mental bandwidth I was using shuffling numbers around in my head so I could input them into my calculator in the right order. With my stack calculator I just get all the numbers down and then swap and roll as needed to get the computation right. The number of times I've had to redo a computation because I started out wrong has been reduced to effectively zero, because I'm able to fully engage my math brain instead of juggling numbers at the same time.


i'm running gforth under termux on android. i just ran a video game on it earlier tonight, this one i wrote last month

do you have a screencast of your calculator's user interface? it sounds interesting!


I don't, but it's pretty simple: it looks like a regular calculator, but instead of having a single line for the number it has a line for building a number and then a scrolling text box that shows the stack [0]. There's an enter button that moves a number from the construction line to the stack, or you can hit any of the operation keys to implicitly use the under-construction number as the top of the stack. If no number is under construction it uses the top two numbers as the operands.

Short tap on the stack swaps the top two stack items, long press rolls the top three. If I were doing it again I'd have long press roll the entire stack (top becomes bottom, the rest move up)—since it's all visible and I'm not programming it this would actually be more useful.

[0] On my phone at the time I wrote it this box had room for three numbers before scrolling. Now it has 12...


i see! given your forth comment, i'm sort of surprised you didn't have r> and >r keys to move numbers between the stacks

i feel like on a multitouch screen we ought to be able to tap on numbers to roll them to the top of the stack


Yeah, there are definitely things to improve, but it's good enough for my use! Keep in mind that the phone I wrote this for had a 3–4" screen—there was barely room for one three-item stack, and no room at all for a return stack even if I wanted one.

Oh, yeah, what I'm calling Roll is actually Rot in Forth.


i think for this kind of thing it makes sense to pop up a low-pass-filtered translucent numeric entry keyboard on top of the values being displayed, the way touchscreen controls for games like fortnite and minetest work (check out handcam streamer videos on youtube if you haven't seen this). then you can use the whole display to display the numbers you're working on, so a 3–4" screen (4000mm²?) can display dozens of numbers. maybe the numeric entry keyboard should only be open while you're holding down a numeric-entry button with your other thumb, but anyway it shouldn't occupy screen real estate; it should overlay screen real estate

you called two different things 'roll'; one of them is indeed rot in forth but the other is depth 1- roll
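
To make the distinction concrete, here is a minimal sketch in Python (not Forth, and not the calculator's actual code). It assumes the stack is a list with the top at the end. The function names `rot`, `roll`, and `roll_whole_stack` are made up for illustration and mirror what Forth's `rot`, `u roll`, and `depth 1- roll` do.

```python
def rot(stack):
    """Like Forth ROT ( a b c -- b c a ): move the third item to the top."""
    if len(stack) < 3:
        raise IndexError("rot needs at least three items")
    stack.append(stack.pop(-3))
    return stack


def roll(stack, u):
    """Like Forth `u ROLL`: move the item u places below the top to the top."""
    if not 0 <= u < len(stack):
        raise IndexError("roll index out of range")
    stack.append(stack.pop(-1 - u))
    return stack


def roll_whole_stack(stack):
    """Like `depth 1- roll`: bring the bottom item all the way to the top."""
    return roll(stack, len(stack) - 1)
```

With four items on the stack the two differ: `rot` only touches the top three, while `roll_whole_stack` reaches down to the bottom. With exactly three items they coincide, which is probably why both got called "roll".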


You can launch GNU bc online using exaequOS: https://exaequos.com/?a=/usr/bin/bc (I am the creator of exaequOS). I will add dc as well since it is appreciated


GNU dc is now online in exaequOS: https://exaequos.com/?a=@Benoit/dc


DC one liners always seem like the most arcane bit of magic (to the uninitiated like me).

E.g. dc -e '_640320[0ksslk3^16lkd12+sk-lmlhd1+sh3^/smlxljsxll545140134+dsllmlxlnk/ls+dls!=P]sP3^sj7sn[6sk1ddshsxsm13591409dsllPx10005v426880*ls/K3-k1/pcln14+snlMx]dsMx'


For those who are on NixOS and can’t find `dc`, it’s present in BusyBox.

    nix run nixpkgs#busybox -- dc -e '_640320[0ksslk3^16lkd12+sk*-lm*lhd1+sh3^/smlxlj*sxll545140134+dsllm*lxlnk/ls+dls!=P]sP3^sj7sn[6sk1ddshsxsm13591409dsllPx10005v426880*ls/K3-k1/pcln14+snlMx]dsMx'


> For those who are on NixOS and can’t find `dc`

    $ nix-locate -r 'bin/dc$' | grep -o '^\w.*\.out' | xargs basename -s .out
    plan9port
    gavin-bc
    busybox
    bc
    _9base
:)


Haha, thanks! I had completely forgotten about `nix2` commands. I’ll try to remember this one!


nix-locate is actually a third-party command, shipped with nix-index: https://github.com/nix-community/nix-index

make sure to check out the pregenerated databases, they're very convenient


Thank you very much! Seems very convenient. :)


Cool, will save this. People should run this if they do not know what it does.


well, they are. it's like reading keyboard macros. btw i think you meant to \ your *s


Oh, you are correct. HN ate the formatting. The example is stolen from Wikipedia. Let's try that again:

dc -e '_640320[0ksslk3^16lkd12+sk*-lm*lhd1+sh3^/smlxlj*sxll545140134+dsllm*lxlnk/ls+dls!=P]sP3^sj7sn[6sk1ddshsxsm13591409dsllPx10005v426880*ls/K3-k1/pcln14+snlMx]dsMx'


FTA: “I'm especially proud of my little prompt macro“

I don’t understand why that has to be a macro. It asks for user input, so “because it has to be fast” can’t be the answer, can it?

I also don’t think it avoids any allocations. It’s creating two Strings, one for the call to read_line and one to convert its trimmed result back to a String, isn’t it?

What am I overlooking?


If I remember correctly, I discovered RPN via an HP 49g+ and was very impressed by its quickness, not to mention the "natural" equation editor, natural meaning that selections work like the RPN stack. I honestly do not remember whether I ever encountered dc (though I've used bc, very casually, mostly for computing partition boundaries and the like), but I still use M-x calc, which is actually RPN as well.

After this post I looked at the man page and, well, it seemed a bit limited. After googling a bit I saw that it's powerful, with macros, registers, usable for interactive scripts, etc. I think it's a remarkable piece of software from another era, not because of RPN, which is still very valid today, but because it was designed to run on limited resources, to the point of taking long to learn. That terseness might be useful today for quickness, but I fail to see a case for dc, and in general it is not needed.

A good example of old glories, nice for inspiring some modern designs, which tend to be crappily complex on average... We can do much with very little, if we find clever ways to do so.


"This is a very simple algorithm, and it needs only very simple tools. Perhaps this is why dc, the UNIX RPN calculator, was the first utility ported to UNIX, way back in the PDP-11 days. It even predates the C programming language; originally it was implemented in B"

I'm pretty sure the version with v6 Unix was written in PDP-11 assembler. It's possible it was translated from a version written in B? It wasn't THAT simple, since it had to implement arbitrary precision arithmetic. I have a vague memory of it using a "buddy system" storage allocator, a scheme also described in Knuth volume 1. Maybe the code is still around somewhere.


Self-followup: yes, the v6 source code is here: dc1.s, dc2.s, etc.

https://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/source/...


  Reverse Polish notation, or RPN, removes the need for operator precedence.
This is true of prefix or "Lisp notation" as well. I would show you an operator precedence table for Common Lisp or Clojure, but we don't need such a table. With that said, one does need to keep in mind how function evaluation and macro evaluation differ. e.g. Consider (f (g (h x))). If f, g, h are functions, then (h x) is evaluated first. If they are all macros, then f is macroexpanded first with the body (g (h x)).

  ... with postfix operators, the only data structure you need is a stack.
This also applies to prefix. I wonder what the motivation was to introduce postfix notation and prefer it over prefix.


postfix notation only needs a stack of numbers, it's shallower, and it does one operation per token. zuse's z3 didn't have much memory or complex control logic

when you have said

   * + 3 4 - 5
(assuming dyadic operators) you have a stack containing an operator, a computed result, another operator, and an entered operand. when you complete the expression with

    6
the computation of the result, -1, should trigger the lurking multiplication. the machine's sequencing is state-dependent

by contrast, in

    3 4 + 5 6 - *
there are never more than three things on the stack in this expression, operators are never on the stack, and no input ever triggers more than one operation. from a user interface perspective, the user also gets faster feedback

i just wrote a command-line postfix calculator in c on my cellphone touchscreen in 20 minutes. it's 31 lines of code. i think that to do that so simply with prefix you have to do it right to left, effectively making it postfix
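
For readers who want to see how little a postfix evaluator needs, here is a rough Python sketch. It is not the 31-line C program mentioned above, just an illustration of the same idea. It assumes whitespace-separated tokens and only the four dyadic operators; `eval_postfix` is a made-up name, and it also reports the peak stack depth so the "never more than three things on the stack" claim can be checked.

```python
import operator

# Assumed operator set for the sketch: four dyadic operators only.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(text):
    """Evaluate a postfix expression; return (result, peak stack depth)."""
    stack = []
    peak = 0
    for token in text.split():
        if token in OPS:
            # Each operator token triggers exactly one operation.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            # Only numbers ever go on the stack, never operators.
            stack.append(float(token))
        peak = max(peak, len(stack))
    if len(stack) != 1:
        raise ValueError("expression did not reduce to a single value")
    return stack[0], peak
```

Calling `eval_postfix("3 4 + 5 6 - *")` goes through the expression from the comment above one token at a time, with no lookahead and no pending operators.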


I vastly prefer postfix for interactive use. I can enter a bunch of numbers and then decide what to do with them. With prefix, I start by typing operators when I haven’t even fully decided which numbers I’m going to be using with them. Basically, prefix emphasizes the operations and postfix emphasizes the numbers.

Prefix is perfectly fine for entering formulas into code. In dc or any other scratchpad, I find postfix much, much more convenient.


They both remove the need for screwball operator precedence, but Lisp s-expressions do it by putting everything in parentheses (or perhaps "everything in function notation" is the better way to put it), and RPN does it by making the stack an integral part of the mental model needed to understand it.


The parentheses are needed in LISP only because many functions have a variable number of operands, so that parentheses are needed to show where is the end of the list of operands.

In a language where all functions had a fixed number of operands, prefix notation (or function invocation in general) could be used without any parentheses.

Parentheses are also required for postfix notation when functions with variable number of operands are accepted (e.g. the sum of an arbitrary number of operands).
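
A small Python sketch of the fixed-arity case, as an illustration only: it assumes every operator takes exactly two operands, so the end of each operand list is implied and no parentheses are needed. The operator table `BINARY` and the names `eval_prefix` and `calc` are hypothetical.

```python
import operator

# Hypothetical operator table: every operator takes exactly two operands.
BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_prefix(tokens):
    """Evaluate one parenthesis-free prefix expression from a token iterator."""
    token = next(tokens)
    if token in BINARY:
        # Fixed arity tells us exactly how many sub-expressions follow.
        left = eval_prefix(tokens)
        right = eval_prefix(tokens)
        return BINARY[token](left, right)
    return int(token)


def calc(text):
    """Evaluate a whole line and reject leftover tokens."""
    tokens = iter(text.split())
    value = eval_prefix(tokens)
    if next(tokens, None) is not None:
        raise ValueError("trailing tokens after a complete expression")
    return value
```

As soon as an operator such as a variadic sum is allowed, `eval_prefix` can no longer know when to stop reading operands, which is the point where some delimiter becomes necessary.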


The parentheses are needed in Lisp because the input is data and not code. There are no "functions" and "arguments" in the data; that's a particular interpretation imposed on the data.

We wouldn't want the data notation to have an implicit shape which depends on the programming language semantics of the symbols which appear in it, like their arity.


Code is a special kind of data, i.e. the kind of data that can be the input of an interpreter or compiler.

A LISP program is not (arbitrary) data, i.e. it does not contain arbitrary lists. Giving data, i.e. an arbitrary list to any LISP interpreter or compiler will just result in an error message and a translation failure.

A LISP program contains only invocations of functions or macros (besides special forms) and each invocation has an associated list of arguments. While the classic LISP convention is to write the function/macro name after the opening parenthesis, this notation is equivalent with writing first the name of the function/macro followed by a list of arguments enclosed in parentheses.

Conceptually, data in LISP, which consists of lists of values, should have used a different kind of brackets for delimiting a list.

Because of the limitations of the character sets available in old computers, the round parentheses have a dual role in LISP, as delimiters for lists that are data and as delimiters for the lists of arguments used by function/macro invocations. The two roles are differentiated by the use of the QUOTE special form, which is just a workaround for not using two different kinds of brackets, which would more clearly differentiate LISP data and LISP code.

If all the LISP functions or macros had a fixed number of arguments, then the LISP code, i.e. the invocations of functions or macros, could have been written without parentheses. In that case, the LISP data, i.e. lists, could have been delimited by round parentheses without using the QUOTE special form, which would not have been necessary.


Yes, a Lisp program can potentially contain arbitrary lists.

It can directly contain arbitary lists or other objects as quoted literals.

It can use macros to create arbitrary syntax, which is then used.

(In mainstream dialects, macros cannot turn an arbitrary list into a form; it has to start with a symbol, after which it can be anything.)


LISP is short for "List Processor". It was developed as a language & system for processing lists. The programming language LISP was supposed to have M-Expressions (M -> Meta) for code and S-Expressions for data. For example a call to append two lists would have looked like this:

    append[listvar;(PARIS BERLIN NEWYORK TOKYO)]
The arguments are enclosed in square brackets. The second argument is written as an S-Expression. S-Expressions were the data notation and were enclosed in parentheses. Symbols were upper case. The names of operators and variables were lower case.

The append call then looked like this:

    (APPEND LISTVAR (QUOTE (PARIS BERLIN NEWYORK TOKYO)))
A conditional would have been:

    [eq[car[l];0] → cons[(ZERO);cdr[l]]; T → x]
Which in S-expression only syntax is:

    (COND ((EQ (CAR L) 0) (CONS (QUOTE (ZERO)) (CDR L)))
          (T X))
As you can see the conditional would have a special infix syntax.

> Because of the limitations of the character sets available in old computers, the round parentheses have a dual role in LISP, as delimiters for lists that are data and as delimiters for the lists of arguments used by function/macro invocations.

No, the syntax above wasn't actually implemented at first. The M-Expression/S-Expression combination was hand translated into S-Expressions, since the Interpreter and Compiler took lists as input.

The reason why the syntax is like this is not a lack of characters or so. The reason is because the language early on worked on code as list data and not on code as text. The input and output was then of list data. The famous "Read Eval Print Loop" reads data, evaluates it and prints the result in data format.

It was thought that a later version of LISP would have the M-Expression/S-Expression syntax as its surface syntax. But that did not get any traction, because the S-Expressions would be visible in the programming tools anyway: debugger, code inspector, code stepper. Thus it was more convenient to stay in the S-Expression syntax than to convert S-Expressions both from and into M-Expressions/S-Expressions.

For example a stepper for LISP code might look like this. :s is the single step command:

    CL-USER 8 > (step (plus (minus 10 20) (plus 20 30)))
    (PLUS (MINUS 10 20) (PLUS 20 30)) -> :s
       (MINUS 10 20) -> :s
          10 -> :S
          10 
          20 -> :S
          20 
          (- A B) -> :s
             A -> :S
             10 
             B -> :s
             20 
          -10 
       -10 
       (PLUS 20 30) -> 
and so on. The interpreter internally sees Lisp code as lists and just prints the lists while it steps the code...

If we wanted M-Expressions, then things would need to be converted everywhere, every time, which is much less elegant than leaving everything in one notation.

LISP 2 was an effort to modernize LISP and to switch to a different syntax.


It is quite frequent for the creators of a programming language to not understand very well their creation, and this was especially true for the first programming languages, when there was very little experience and theory about programming languages.

How John McCarthy and his coworkers explained LISP between 1958 and 1960 does not necessarily match what LISP really is.

What you say about the history of LISP is correct, but it is not relevant for your thesis, that LISP required parentheses because supposedly LISP code is data.

As you say, LISP is a list processing language. Its main data structure is the list. A list may have a variable number of elements, therefore its notation requires parentheses for enclosing the elements of the list.

On the other hand, a LISP program is data neither more nor less than a BASIC program is data or a C program is data.

After parsing, a LISP program may be stored in lists, but this is an implementation detail that does not matter for the definition of the language. If desired, any program written in any programming language can be stored in lists, after parsing.

An arbitrary list is not a valid LISP expression, i.e. valid LISP code, even if it is valid LISP data. Moreover, LISP code cannot be provided as an operand to a LISP function that processes lists. Only LISP data can be provided as an operand. LISP data includes quoted LISP code.

Quoted LISP code is LISP data and it is of course stored as a list, but unquoted LISP code, i.e. real LISP code, is not normally stored as a list, except in the simplest and least efficient LISP interpreters, which do not have any practical importance.

So LISP code is not LISP data and LISP data is not LISP code. They have different syntax and interchanging LISP code and LISP data in a LISP program will normally cause a syntax error, exactly like interchanging the content of a string constant with a statement (i.e. writing it without quotation marks) will cause a syntax error in most programming languages.

In any programming language you can have a string constant that stores some program sentences or expressions, which is data that can be parsed as code at run time. The only special feature of LISP is that its syntax is very simple and regular, so parsing and processing at run time is trivial.

As I have said, LISP data requires parentheses, but LISP code is not LISP data and it does not require parentheses for being data. It requires parentheses only because most standard functions and special forms may have a variable number of operands. One could make a LISP dialect where all functions with a variable number of operands, like ADD, would be restricted to a fixed number of operands and where the same restriction would be applied to the special forms, e.g. only IF would be used instead of COND, while AND and OR would have two operands. In such a LISP dialect there would be no need for parentheses in LISP code, but the LISP data, i.e. the lists would continue to need parentheses (but QUOTE would no longer be needed to differentiate lists from function/macro invocations).


> An arbitrary list is not a valid LISP expression

In Common Lisp terms, an arbitrary list is a valid expression, but not necessarily a valid form. A form is an expression that occurs in a context where it is presented for evaluation.


> but LISP code is not LISP data

well, that's obviously wrong. There are so many examples where LISP code is LISP data. And LISP code is just one example; code for a theorem prover can also be LISP data. The whole idea of LISP is to be able to write interpreters, code transformations, rule engines, compilers as LISP applications which process code in the form of lists.

The idea of being a LIST PROCESSOR is not an implementation detail, it is the core essence of LISP.


RPN also means that the programming notation is only that. It doesn't give us a data notation, other than a flat sequence of words. This is why Lisp and Forth are not really comparable. In Lisp, you can write made-up words that don't have any previous definition (let alone operator arity), and clump them into whatever nesting you want. As long as you don't try to pass that through the standard evaluator, it's fine.


With postfix, numbers get pushed to the stack in the order you enter them and operators can be evaluated immediately. This way, intermediate results are always naturally available as you type an expression. On an HP calculator you'd usually see the intermediate results on-screen, but in dc you type p at any point to print the current top-of-stack.

Prefix notation in its simplest form is essentially the same thing but backwards, and you would only get intermediate results if you start writing from the innermost expression, which is complicated by the fact that that expression doesn't appear in the order you normally type.


When the expression is interpreted, not compiled, postfix has the advantage that you only need to memorize the yet unused operands.

With prefix, you have to memorize also the yet not executed operations and for each operation you need to memorize how many operands have already been provided for it, in order to execute the operation as soon as its last operand has been provided.

So for a compiled language it does not matter much whether an expression is written with prefix notation or with postfix notation, but for an interpreted language the latter is more efficient.
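
The extra bookkeeping for prefix can be sketched as follows. This is an illustrative Python sketch under the assumption of dyadic operators and whitespace-separated integer tokens; `eval_prefix_streaming` is a made-up name. The interpreter keeps a list of not-yet-executed operators, each with the operands collected so far, and a single number can trigger a cascade of operations.

```python
import operator

# Assumed operator set for the sketch: dyadic operators only.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_prefix_streaming(text):
    """Evaluate a prefix expression left to right, one token at a time."""
    pending = []   # entries are [operator token, operands collected so far]
    result = None
    for token in text.split():
        if result is not None:
            raise ValueError("input continues after a complete expression")
        if token in OPS:
            # The operation cannot run yet; remember it and wait for operands.
            pending.append([token, []])
            continue
        value = int(token)
        # Hand the value to the innermost waiting operator. If that completes
        # it, the result is handed to the next operator out, and so on.
        while pending:
            op, args = pending[-1]
            args.append(value)
            if len(args) < 2:
                break
            pending.pop()
            value = OPS[op](*args)
        else:
            # Nothing is waiting any more, so the expression is complete.
            result = value
    if result is None:
        raise ValueError("incomplete expression")
    return result
```

In `* + 3 4 - 5 6`, the final `6` completes the subtraction and the result of that then completes the multiplication, so one input token runs two operations. A postfix interpreter never has to do that.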


As an HP mini calculator user since the late 1970s I use dc as my favorite calculator on every terminal since the late 1980s. Annoyingly it is not installed on every server or embedded system.


At the computer my level of math escalation is the generic Mac desktop calculator.

From there I may shift to a handy emacs scratch buffer.

After that it moves to firing up Slime into CL. The *, **, and *** variables (which in CL hold the 3 most recent evaluation results) are actually quite handy. Plus I just have the Slime history and all that.

None of those are RPN, but at this level I tend to not enter long equations anyway, and I have to do just as much precedence parsing cognitively converting algebraic to RPN or prefix anyway.


I love RPN! I got used to it during my engineering studies in Germany, around 1992. I got an HP 48SX and it was hard at first to really understand RPN. But once I got it, it boosted my math skills, because to work properly with RPN you MUST understand what you are doing. It exactly mimics calculating with pen and paper. If you don't understand what you are doing, you are doomed. So I had to understand what I was doing, which was very helpful for learning math the right way! By the way, the funniest part of using an RPN machine was that students who wanted to borrow my calculator gave it right back after trying to use the thing the algebraic way.


I am an avid bc(1) user, dc(1) I never could figure out. I think I will now spend time trying to get use to it.

cool article


Once upon a time, if you were an avid user of bc(1) you were an avid user of dc(1). This is an extract of the Unix V7 man page for bc(1) https://man.cat-v.org/unix_7th/1/bc :

    Bc is actually a preprocessor for dc(1), which it invokes
    automatically, unless the -c (compile only) option is pre-
    sent.  In this case the dc input is sent to the standard
    output instead.
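
The idea of an infix front end emitting postfix for a stack calculator can be sketched in a few lines of Python. This is not how V7 bc worked internally (it handled a full language); it is just a minimal shunting-yard translation, assuming non-negative integers, the four basic operators, and balanced parentheses. `infix_to_postfix` and `PRECEDENCE` are names made up for the sketch.

```python
import re

# Assumed precedence table: * and / bind tighter than + and -.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def infix_to_postfix(text):
    """Translate a simple infix expression to space-separated postfix."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    output = []
    operators = []
    for token in tokens:
        if token.isdigit():
            output.append(token)
        elif token == "(":
            operators.append(token)
        elif token == ")":
            # Flush operators back to the matching open parenthesis.
            while operators[-1] != "(":
                output.append(operators.pop())
            operators.pop()
        else:
            # Left-associative: pop operators of equal or higher precedence.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                output.append(operators.pop())
            operators.append(token)
    while operators:
        output.append(operators.pop())
    return " ".join(output)
```

Appending ` p` to the returned string gives input that dc can evaluate and print; all of the precedence handling lives in the translator, which fits the difference in parser size mentioned at the top of the thread.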


I also use dc


ha! Nice article! I've never encountered another dc user let alone a blog post. Bravo.


You're not the only one!

I also use `dc`, together with `grpe` and `gits tatus`.


Actually as I recall back in the day ‘dc <dir>’ would actually do something. Nothing good, but something. Like it was trying to do math on a raw directory entry.


I see what you did htere


Hello fellow dc user! I use it as my default calculator and even in shell scripts I'm all "sum=$(echo $a $b + p | dc)" (though it is usually a much more complex expression of course)

I still have my HP48SX RPN calculator too!


> Hello fellow dc user!

What are the chances all four of us would find the same thread? It’s so fast, so reliable, so omnipresent. Between that and being able to turn multiplication into addition by working in powers of two, I get answers while my colleagues are still waiting for excel to start.


Ahaha :) Nice one. It's kinda similar to my net docs system written in Ruby. I just type blgrep ip=192.168.0.10,15 hosts.txt and get an answer before people can even start any IPAM :) CLI power... Unfortunately, it does not get the respect it deserves these days.


I'm not sure why I started using dc instead of bc, but I think it might have been because at one point in time bc output a bunch of extra noise that made it annoying to use in shell scripts, so I just learned dc is the standard way to calculate stuff in UNIX and kept on using it that way. I noticed some new Linuxes don't have it installed and I'm always stuck for how to calculate things in shell scripts. I guess people just use Bash nowadays. In any case, thanks to the Gavin Howard dc I still use it today, even on the Windows command line. dc is great!


Also, unlike bc, dc can be used to write binary files.

(and it sure beats a magnetised needle and a steady hand)


I have a user on Windows? Gasp!

Glad you like it!




