Somebody, somewhere, wrote the Text::LevenshteinXS module. Somebody, somewhere, had to write sed, awk, head, sort, tr...
It's all fine and dandy to say "look at these awesome tools that make tasks like these trivially easy. See how powerful Unix is". But this fails to consider that somebody, somewhere, has to be a tool writer, not just a tool user. Knuth's code was a tool writer's code, exemplifying a technique (Literate Programming) that is aimed at long form code writers in general.
As with others before, the author fails to grasp that this is an apples to oranges comparison.
This is one of those stories I glance at briefly out of a kind of mildly masochistic fascination ("are people really going to argue about this" "you know damn well they are").
I don't think there is such a sharp distinction between writing and using tools. You could take the 9-command pipeline, paste it into a shellscript, and now you have a new tool.
And isn't this how most programming is done? Typically your program relies on a set of libraries, but the methodology of writing the library looks very similar to the methodology of writing the client code; at the end of the day, it's all programming.
If literate programming is great, then I would expect it would be great everywhere. And if decomposing your program into a few steps and piping together already-written software for those steps is great, then I would expect you should try to use that style of programming as widely as possible.
When the task was to make a tool, the criticism should not be that there are other tools to do this.
So, yeah. If you want a chair, you just go buy a chair. But if you are curious if a method of furniture making is good, you don't take as evidence someone that orders a chair from a catalog.
> you don't take as evidence someone that orders a chair from a catalog.
That's misrepresenting what's going on.
A better parallel is that the person curious about furniture-making still isn't doing everything from scratch. They're gonna go buy wood and tools - they're not forging an axe, hammer, screwdriver, screws, and nails, or chopping down a tree and shaping the wood themselves. The furniture-maker is still following the unix philosophy of using and building on prior work.
Depends how you read it. He offered to demonstrate a tool, literate programming. The task he was asked to demonstrate it on was counting words. He was not simply asked to count words; that is different.
It was a purposely pedagogical exercise to demonstrate literate programming, which is itself a pedagogical method of writing a program.
To that end, by reading his program you can learn how he wrote his program. Reading the shell script really only tells you which commands were strung together: if you don't know what those commands do, you will not learn it from the script. I don't know Pascal, but most of the rest still makes sense to me, enough that I could probably port it. The shell script?
Now, again, if the task is just to count the words, you are probably fine with whatever crosses the finish line. If the task is to demonstrate literate programming? How does that help?
> As with others before, the author fails to grasp that this is an apples to oranges comparison.
I think comparing apples to oranges would be comparing Knuth as a programmer to Spinellis (the author of the blog post) as a programmer; they obviously have different approaches to programming, which one can't compare without bringing in some externally imposed value judgements.
The post itself isn't a comparison at all: it says "this task, which was claimed to be very difficult using only a certain set of tools, is actually easier than one would expect even when restricted to those tools." (OK, I guess 'easier' is a comparison, technically, but only to expectations.) This says nothing about the quality of Knuth, or of Spinellis, or of the ideologies of tool use and tool creation; it says only that the existing Unix toolkit is very rich, much more so than some might expect.
Those authors (of the Perl module and other software) may have worked from existing source material from other authors, too. In fact I'd guess it's pretty likely. Research into this could even promote a much more inclusive and informed approach to technology use and development.
(Until then though, perhaps we'll keep having this use-your-own-brain vs. don't-reinvent-the-wheel discussion)
Just like apples and oranges can be compared as foods (and fruits), bespoke vs. reused solutions can be compared as software. The reused parts are of higher quality than their corresponding bespoke implementations and can be composed to accomplish the same task as well as many others. It's a powerful lesson.
Most of the components are multi-platform, partly or fully POSIX-standardized, battle-tested, blazing fast, etc. It's some of the most widely-used and arguably greatest software ever written.
And some of it is absolutely terrible, with several tools overlapping in use cases and unclear performance hints. Even worse, the combination itself can perform worse, which may not matter on small inputs but has a huge impact when you crank up the size. Then there's the issue of understanding why performance suffers and in which cases it's better to roll your own custom solution that better fits the problem.
Reuse saves work when your problem maps perfectly onto existing tools; when it doesn't, what may seem like a minor difference propagates through your program and ends in countless issues rooted in a codebase you don't know.
It takes real stones to suggest Unix utility code is better quality than Knuth's handicraft, especially without having looked into much of it. Unix code usually works well enough on unchallenging input.
By some indices (simplicity, documented-ness, accessibility {as in: I can read and understand it without learning more than one language's behavior vs. bash/C/several others}), Knuth wins hands-down.
By others (generality, speed of implementation) he does not.
It is normal to point out at this point that if correctness is not important to you, an overwhelmingly more quickly produced implementation is possible.
But my comment was on remarks claiming superior reliability for the Unix utilities, which you have not addressed.
I came here to basically say the same thing ... if Knuth weren't demonstrating Literate Programming (LP), he would have used a library version of the Trie function that took up most of those eight pages. At that point, who cares whether those six pages were done in LP or not ... it might be easier to maintain in the long term but either way it's code you don't have to write. If he was finding the Levenshtein distance as with this post, maybe his LP system could use the Perl library too?
1. I can't speak for Hillel Wayne who used the word "framed", but I didn't understand his newsletter post as Bentley having "framed" Knuth -- I understood his post as pointing out that in the popular imagination/folklore, the story had mutated over the course of years from the original setting (a program that Knuth was asked to write in WEB specifically as that was the point, and a review of that program by McIlroy evangelizing the then little-known Unix philosophy) to a "framing" where two people were competing to solve the same problem with the same available resources, and one of them did it in a "worse" way. (Also left this comment on the blog post above.)
2. Here's a comment on the previous thread from someone who says they read the column when it was posted, and their reaction they say was one of cringing -- so at least at that time it probably wasn't perceived that way: https://news.ycombinator.com/item?id=22418721
3. Much of the space taken by the literate program is for explaining a very interesting data structure that we could call a hash-packed trie (AFAICT, devised on that occasion for that problem -- a small twist on the packed tries used in TeX for hyphenation, and described in the thesis of Knuth's student Frank Liang). One cannot obtain this data structure by combining other programs, only by combining other ideas. (I mentioned this in the previous thread as well: https://news.ycombinator.com/item?id=22413391)
4. So as far as evaluating literate programming goes, the real question (and the answer is not obvious to me!) is: if you're going to write a program that uses a custom data structure (like this), how should you organize that program? Should you write it as Knuth does, or as a conventional program (like I tried to do with my translation: https://codegolf.stackexchange.com/a/197870)? And as for estimating the value of a new data structure in the first place: as of now (at that question), solutions based on a trie are about 200 times faster than the shell pipeline, on a large testcase. (The hash-packed trie, which Knuth calls "slow" in his program, is not so bad either, and it does economize on memory a bit.)
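For concreteness, here is a deliberately naive sketch of the "custom data structure" idea: a plain dict-based trie for word counting, in Python. It is nothing like Knuth's hash-packed trie and makes no performance claims; it only illustrates the kind of structure you get by combining ideas rather than programs:

import re
import sys

def top_words(text, k):
    # Build a trie: each node is a dict mapping a character to a child
    # node; the special key "#" holds the count of the word ending here.
    root = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["#"] = node.get("#", 0) + 1

    # Walk the trie and collect (count, word) pairs.
    pairs = []
    def walk(node, prefix):
        for key, value in node.items():
            if key == "#":
                pairs.append((value, prefix))
            else:
                walk(value, prefix + key)
    walk(root, "")

    pairs.sort(key=lambda cw: (-cw[0], cw[1]))
    return pairs[:k]

if __name__ == "__main__":
    for count, word in top_words(sys.stdin.read(), 10):
        print(count, word)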
I have my own answer for #4 (which, to me, is the only interesting question about this affair). I've actually done a fair amount of literate programming on my own, although I only have a couple of examples that one can look at these days. Here is a small library for fluent matcher system for Jasmine and React: https://github.com/ygt-mikekchar/react-maybe-matchers/blob/m...
You will see that I've included yet another monad tutorial :-) I don't link to this as a way of saying that I think this is a good example of LP. It's not really. I was experimenting quite a lot. However, I can tell you one thing about it: it is practically impossible to refactor.
As a result, I decided that LP is not particularly good for working on living programs. Or, at least, it is not conducive to my style of programming, which encourages refactoring. Nothing I write is "frozen". It is all in flux and so the value of documentation is transient. Additionally, it is rare that a programmer wishes to read code from the top to the bottom. If they ever do, it's usually the first time they have read the code. After that, they will want fast access to the parts that they want to modify. Sorting out the code from the text becomes difficult. If you make a change, you also have to review all of the text to make sure that you haven't clobbered something that is referenced elsewhere. It will work well for something short, but it's not great for large projects.
I still do LP style things. Here is an unfinished blog post on ideas about OO: https://github.com/ygt-mikekchar/oojs/blob/master/oojs.org However, to contrast with this, I would invite you to look at https://gitlab.com/mikekchar/testy where I put some of those ideas into action (especially see the design.md and coding_standard.md documents to show what constraints I chose in this experiment). Crucially, after this code had run its course, I'd changed a lot of my ideas and never went back to my blog post. For me, the actual code is far more instructive than the blog post ever was. Of course, I'm the author, so I understand what I was trying to say and I only need a quick peek at the code to remind me what I was thinking.
For me, that's the dilemma of LP: once you know what you want to know, the text is in the way. New people will benefit from the Gentle Introduction (sorry, couldn't resist the TeX reference...), but 99% of the time nobody will benefit from it. Is the other 1% of the time worth it? It may be, actually, but boy is it hard to convince yourself of that!
Thank you, the voice of experience counts for a lot, and I'm glad to hear from a rare person who has actually tried LP seriously (I'm not one of them!). I'd like to dig deeper for your thoughts on a couple of your interesting points:
• Living programs: You mention the point that you find LP hard to refactor, because things written tend to feel "frozen". But writers do often mention ripping out several chapters of their books or carrying out extensive rewrites in response to editors' feedback etc. (Though some don't: look for the second mention of "Len Deighton" in this wonderful profile of the editor Robert Gottlieb: https://web.archive.org/web/20161227170954/http://www.thepar...) Conversely, for those of us without much writing experience, I wonder whether literate programming may train us to become better writers, in the sense that programming (which inevitably tends to require rewriting) may make us more comfortable with doing major rewrites of our work. (Or at the very least, it may train us to chunk our code with an eye to which parts are likely to be changed together later, parts which might otherwise sit far apart in the code.)
• Linear reading versus fast random access to code: I think it's very much true that after (or even during!) the first reading, one wants fast access to relevant sections of code, and not to read it from top to bottom. But books are also designed for random access. (The first piece of advice here: https://www.cs.cmu.edu/~mblum/research/pdf/grad.html) Many of the contrivances of Knuth-style LP (the cross-references, the indexes, the table of contents, the list of section names at the end, for that matter even section numbers and page numbers) seem designed to facilitate this. (See the TeX program at http://texdoc.net/pkg/tex especially the ToC on the first page and the two indexes at the end; the printed book also has a mini-index on each two-page spread, which is missing here.) In fact, I'd imagine that even if all that you used LP for was to organize the code in a way that better facilitates random access (e.g. just add section names to your code blocks, or move error-handling code to a separate section to be tangled-in later) it alone may prove worth it.
• Documentation versus code: In one of your examples, you seem to be writing exposition / documenting the (user-level) purpose of the code at the same time as programming. Do you find this to be the case often? My experience with LP is mainly with attempting to read the TeX program, which on the first page says "[…] rarely attempt to explain the TeX language itself, since the reader is supposed to be familiar with The TeXbook" (the TeX manual). And for the most part, whatever text is in the program is about the code itself, things that still matter once you know the program already. (This is in fact my struggle with it, it's not written like a novel; all the text is oriented towards details of the program code itself.) As that's a large example, pick a small one like this: https://github.com/shreevatsa/knuth-literate-programs/blob/m... -- there is an intro page about the problem and cache size etc., but most of the rest of the text seems comparable to what one might write as comments even if not doing “literate programming” as such. So the main difference LP is contributing seems to be with code organization (what one might otherwise do with functions). In fact, probably most of us modern programmers wouldn't consider it the best way to organize this program, but it's interesting to consider what the author's intent may be with organizing code that way.
> Here is a small library for fluent matcher system
It doesn't seem like Literate CoffeeScript lets you reorder the code blocks which, as I understand things, is the fundamental part of Literate Programming - code follows documentation, not the other way around.
(Although, to be fair, 99% of things I've seen labelled as LP aren't either. There's only WEB, CWEB, and noweb I can think of that'd count just now.)
Yes, you are absolutely correct, and it's definitely a big problem from an LP perspective. Babel gives me a bit more leeway, though. But from the perspective of "is this worth it", not having it reordered actually makes it easier to work with, in my opinion. I think if you had tools that allowed you to work with the generated code and jump back and forth to the sources, it might be OK.
The rebuttal itself is a lot more comical than any joke: literally countering Knuth’s “code should read and flow like prose for full understanding” with “f you, look at what I can do with my 20 years of hacking and thousands of lines of code I’ve never seen, written by someone else”.
I really like the idea of the Unix model as well, but you're not going to be able to use it effectively to write an actual application. If you're writing a word processor and you need to find the levenshtein distance between the most frequent word pairs (maybe some measure of how alliterative/consonant/assonant your document is?) then you're probably not going to be building the word processor using the Unix model, and even if you are (the closest you can get is probably using Tcl/Tk?) then it's still best to write out what you're doing as clearly as possible. Note that it took me about 5 minutes to figure out what the shell pipeline presented in the article actually does, and multiple times my reasoning about it led me to think "wait, does this actually do what it's supposed to do?"
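For contrast, here is roughly what "Levenshtein distance between the most frequent words" looks like when written out explicitly. This is a Python sketch of the task as I described it above, not the article's pipeline or code, using the textbook dynamic-programming recurrence:

import re
from collections import Counter

def levenshtein(a, b):
    # Classic dynamic-programming edit distance, computed row by row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def frequent_word_distances(text, k=10):
    # Distances between every pair of the k most frequent words.
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    words = [w for w, _ in counts.most_common(k)]
    return {(a, b): levenshtein(a, b)
            for i, a in enumerate(words)
            for b in words[i + 1:]}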
A word processor is anti-Unix on its face. If you want the Unix equivalent, look no further than vi and TeX. With vi you can pipe your document through a spellchecker such as gnu aspell [1].
Thank you for adding to the considerable weight of literature completely missing the point of Dr. Knuth's literate software, which is:
tr, sort, uniq, and sed, should all be literate programs.
They would be easier to read, reason about, modify, and extend. At this point, tooling for literate programming lags considerably compared to illiterate programming, and that's entirely because of the determination to miss the point exhibited here.
The PDF still doesn't help much. The expository style of breaking out inner code blocks from their call site harms the ability to understand what's happening. It's nearly impossible to follow in the raw source. Hyperlinks don't improve matters much, and the PDF rendering doesn't have a rational layout for details like numeric tables.
The first implements a programming language and typesetting system, while the second just swaps characters. I'm not sure it's a fair comparison. (Additionally, the TeX source is also meant to be formatted, not read raw.)
The TeX source is not in the form in which it is intended to be read. It's as if you showed the current HTML and JavaScript source code of some article and complained that the message is hard to read.
What you showed as the TeX source is at the same time a source representation for this book:
Computers & Typesetting, Volume B: TeX: The Program (Reading, Massachusetts: Addison-Wesley, 1986), ISBN 0-201-13437-3
and at the same time the source from which the "plain" Pascal program can be extracted.(1)
That was the idea of Literate Programming, which Knuth also tried to demonstrate in the article for which he was "framed."
Which other comparably hard-to-develop program (this one took the best programmer in the world, supported by his students and assistants, 10 years) has a nicely printed book form that fits in 600 pages and contains all the descriptions?
Even more impressive, Knuth intentionally developed his program with the specific goal that its output stay the same no matter how much computers change in the future. And he managed to achieve this -- ports of his original program are available everywhere, and using his sources from the eighties produces exactly the same pages as it did then.
------------
1) "WEB programs are converted to Pascal sources by tangle and to a TeX input file by weave. Of course, tangle and weave are WEB programs as well. So one needs tangle to build tangle---and weave and TeX to read a beautifully typeset WEB program" -- that is, if you don't buy a book which is already typeset and printed.
Also, from my point of view, I am MUCH MORE conditioned to read the second variant of the code.
My third point is that the TeX source must be read in PDF form, not in TeX form. I would even go further: TeX source code should be manipulated in PDF (read: readable) form, not as source code per se.
Literate programs are essays. They might be easy to read and reason about, but not to modify or extend. A computer program is not best understood and managed as a linear artifact. Much of its power is in its graph nature.
Documentation comments are great, but that's not the same as literate programming.
I don't think a literate program is more or less linear than the source code that is extracted/tangled from it. Both artifacts have a sequence: for a C program, the tangled version would put the #includes before declarations before definitions, for example.
In that sense, the LP program is an alternate linearization of the program, in that the authors can choose the order in which to introduce the program. But few LP programs are naively linear -- they typically impose a tree structure on the code, made up of labelled sections and subsections. Readers don't have to start at line 1 of the program/essay, they can navigate from the table of contents to the section of interest.
A compelling argument for LP is that it's an additive technology. If you don't want to read the essay, that's fine -- just tangle the code, and read the source-code artifact instead. With the right tooling (which admittedly may not exist!) an IDE could let you edit the tangled version directly, and put your edits back into the "essay" at the right places, so round-trip editing would be feasible.
I think I understand part of the problem. Many "literate programs" aren't literate in Knuth's sense; they are merely inversions of the conventional model, where text is the default and code is the special case that has to be demarcated. Things like the literate markdown I've seen typically read like a regular program with extra text:
# A Literate Program
This is a literate program, the language is C. We'll
begin with the includes because that's what C has at
the start of every C file, and not because it makes
any sense for the presentation:
```
#include <stdio.h>
...
```
Here are the declarations, you can ignore these for
now.
```
int main();
double square(double x);
```
Now that that's out of the way, ...
If that's all most people see, then they haven't actually seen the benefit of LP: being able to push that boilerplate stuff to an appendix so no one has to see it unless they're changing the libraries used by the system or some other thing that's important but less essential to the understanding that LP tries to promote.
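Compare a rough noweb-flavoured sketch of the same kind of fragment, with the chunks reordered so the interesting part comes first (the chunk names and the little Python payload here are invented purely for illustration):

This program prints the squares of the numbers it reads.
The whole file is assembled from named chunks:

<<squares.py>>=
<<imports>>
<<main logic>>
@

The part worth reading comes first:

<<main logic>>=
for line in sys.stdin:
    print(float(line) ** 2)
@

And the boilerplate is banished to an appendix nobody has to read:

<<imports>>=
import sys
@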
That's an excellent point. I said "most literate programs aren't linear" in my comment... But I wasn't considering the low-effort linear style that many people actually use, so I'm probably wrong on that. :)
"Low-effort linear literate" is a useful style, but I think it falls quite short of what Knuth had in mind.
I've found it to be a useful bootstrap toward a properly literate environment, which will require considerable tooling support to provide a reasonably modern experience.
Happily, we have the Language Server Protocol now, so many of the key components are already in place...
It's not fair to compare 40(!) years of advances in tooling surrounding the pile-of-files approach to software, to the somewhat withered on the vine approach embodied in literate programming.
It's a road not taken, and I think that's a pity, so I'm doing something about it <shrug>
> they typically impose a tree structure on the code, made up of labelled sections and subsections.
But most non-trivial programs don't have a tree structure, but are a full directed graph. Methods calling other methods, classes inheriting from other classes or implementing interfaces, etc. Programming and debugging could involve traversing and modifying this graph in almost any order, and does not lend itself to one preferred linearization.
An LP style that somehow reflected the various graphs of a program (control flow, inheritance, etc.) might be very interesting! I'm not sure what it would look like, but it sounds like a starting point for experimentation.
A big program could be broken into modules -- as we already do -- and each module (or its sub-modules) could be documented in a literate style, independently from the other modules. Maybe the graphs of the program (or at least, the graphs of its highly-connected modules) could be presented as alternate trails, indices, tables of contents, etc., each with its own accompanying narrative overview. It sounds gnarly, but not impossible! On the other hand, losing linearity altogether seems to be an anti-goal for an inherently narrative programming style like LP. If you're telling a story (about code, or anything else), eventually you have to put one sentence before the next. At some level of granularity, you have to commit to straight lines.
When he wrote his book, it's clear that Knuth was very much aware that LP was an unusual, and possibly crazy, idea. The book is written in a humble style, like an invitation to explore a design space with him, and not as a prescriptive text. For example, I think the fact that he included McIlroy's full critique in the book speaks to his intentions. I guess my point is that Knuth would probably love that you're challenging his ideas and exploring the design space with him, rather than dismissing LP outright.
> But most non-trivial programs don't have a tree structure,
Pretty much all programs have at least one tree structure that covers the entire program (the AST), though that may or may not be the most interesting view of the program structure.
One such program that I know of and use in conjunction with pandoc is enTangleD[1], which supports editing on both sides (the literate source and the tangled code) at the cost of comments embedded in the source to keep track of blocks. All that's really required is folding of those comments to get halfway to the desired IDE.
> Literate programs are essays. They might be easy to read and reason about, but not modify or extend.
What's your rationale behind that last part? Have you ever used any notebook-styled interface like with Mathematica or Jupyter? It's perfectly feasible to tear out a chunk of material and replace it with something else, if you've organized it well. This is no different than the same constraint for conventionally written software. You can't easily refactor shitty code. You can't easily refactor shitty literate code. If you organize your code well and write quality code, whether in the literate or conventional style, you can refactor it with relative ease.
> A computer program is not best understood and managed as a linear artifact. Much of it's power is in graph nature.
And how does that conflict with literate programming? Literate programming permits the reorganization of code into any arbitrary structure. Which means that the graphical nature of the code can be made even more obvious than in most conventional languages. You're no longer bound by single-module files (see Java) or other arbitrary textual constraints. You can place the code in the place that makes the most sense for explication. Or put it adjacent to where it's used, even if it ends up tangled in a different file.
The Effective Debugging book is a must-read for any software developer. The Elements of Computing Style is useful for any knowledge worker. Code Reading is probably the only important book on the subject.
The only problem with his books is that they are rather expensive, especially for developers who don't earn in dollars :-(
That is indeed a pity. I try to compensate by making as much material as possible openly available, such as through the MOOC you mentioned (I've been working for five years on it), through my blog, and through open source software and content.
The author did it in one line of Perl, using an existing library. How is that different from using awk? Yes awk is widely deployed but so is CPAN. In any case deployment isn’t part of the argument for using the UNIX philosophy.
> ... [a] more practical, much faster to implement, debug and modify solution of the problem takes only six lines of shell script by reusing standard Unix utilities.
Unlike tr, sort, uniq and awk, perl is not a standard Unix utility. Not only that, but Text::LevenshteinXS is a module that must be downloaded separately.
It's still far more convenient than Knuth's work, and it follows Spinellis' reasoning about the Unix mindset, but Spinellis' Levenshtein example doesn't actually support Mcllroy's original argument.
As far as I can see, it's roughly "Data structures are hard, so let's pretend everything is ASCII text. Now we can use a really difficult systems programming language (C) to build functions with weird calling conventions ("tools") and glue them together with an awful scripting language (sh)."
'awk' fits into this framework awkwardly. It implements a restricted pattern (go line-by-line, match actions to lines), it doesn't want to be a full programming language, even though it really is.
But 'perl' is a programming language, and it wants to be one. Once you have 'perl', what is the point of using a reasonable scripting language (perl) to build functions with weird calling conventions and gluing them together with an awful scripting language? You're better off writing functions(!) with normal calling conventions (a library) and gluing them together using the good scripting language.
That logic, taken to its conclusion, replaces the shell with a clean language, encourages libraries instead of "tools", and embeds the 'awk' pattern into said language instead of relegating it to an incomplete secondary scripting language. In one word: 'scsh'.
I believe it is the idea of writing small tools focused on doing one thing well with reusability in mind as opposed to writing larger complicated tools that do multiple things.
A cargo-cult philosophy never adopted by commercial UNIX clones and adored by UNIX FOSS, where the man page of each GNU tool, describing the available set of command-line arguments, looks like a kitchen sink.
> In my everyday work, I use Unix commands many times daily to perform diverse and very different tasks. I very rarely encounter tasks that cannot be solved by joining together a couple of commands.
Others just use a REPL instead, where tr, sort, uniq, and sed get to be function calls with a threading macro.
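Python has no threading macro, but the same "commands become function calls" point holds at the REPL. A quick sketch of the word-count task there (the file name is picked arbitrarily):

>>> import re
>>> from collections import Counter
>>> # re.findall plays the role of tr; Counter.most_common of sort | uniq -c | sort -rn
>>> words = re.findall(r"[A-Za-z]+", open("/etc/fstab").read().lower())
>>> Counter(words).most_common(5)   # top five (word, count) pairs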
The UNIX shell is a primitive REPL, without the capabilities of the REPLs developed at Xerox PARC, TI and Genera, regarding structured data, debugging tools, function composition, inline graphics, ability to directly interact with OS APIs.
A car without infrastructure is just a fancy box.
A chariot without infrastructure is a rideable horse.
A rideable horse in the age of broken, disparate, infrastructure.
But, at the end of the day, it all depends on what you're trying to accomplish. I use repls, shells, notebooks, etc, on a regular basis. Unix tools solve some problems. Repls solve other problems. Notebooks another. What's important, to me, is to be able to make the most out of them all, despite their flaws, because they're simply the tools that we have in our toolchain. It would be a shame to not learn our own tools, when they can offer us so, so much.
No need for language snobbery - the sanity of the "language" isn't what's being discussed here. We're talking about the capabilities of unix tools within domains where they'd be used. If you want to use a repl within your domain, that's your choice, but understand that in doing so, you're working with a relatively limited domain compared to those within the reach of unix binaries.
The threading macro just represents function composition, while a Unix pipe represents buffered streaming I/O (or composition of dataflow operators, if you like). Two related but quite different things.
It can, but only at the cost of complecting source and sink.
By default, function composition is eager: a function is expected to do its work, and hand the whole return value off to the next function.
We can make this lazy, by setting up an iterator and handing this off. At some expense: our next function must expect an iterator, and therefore can't handle a full data structure. At minimum it must coerce those into an iterator when encountered.
Also, it gets awkward to reason about iterators wrapped in iterators wrapped in iterators, even with a debugger, you get action at a distance, where the fourth function in your thread is failing because the first iterator of three has a flaw in it.
Shell pipes handle all of this for the user, with sensible defaults which can be overridden and modified for special cases. It's a powerful abstraction and I wish more languages offered something like it.
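To make the contrast concrete, here is a minimal Python sketch of my own (not tied to any particular library) of eager versus lazy composition; the lazy version streams items one at a time, which is roughly what a pipe gives you for free:

import sys

# Eager composition: each function finishes its whole job and hands
# back a complete list to the next one.
def words_eager(lines):
    return [w for line in lines for w in line.split()]

def lowercase_eager(words):
    return [w.lower() for w in words]

# Lazy composition: each stage is a generator, so items stream through
# one at a time -- but now every stage must agree to speak "iterator".
def words_lazy(lines):
    for line in lines:
        yield from line.split()

def lowercase_lazy(words):
    for w in words:
        yield w.lower()

for w in lowercase_lazy(words_lazy(sys.stdin)):
    print(w)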
Using TXR Lisp, obtain the list of words in /etc/fstab on a Ubuntu 18 system, sort them to get identical words into groups which are represented as sublists, then sort by descending length of sublist (i.e. frequency), take the top ten, and turn that into word-frequency pairs:
This is the TXR Lisp interactive listener of TXR 232.
Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
1> [(opip (open-file "/etc/fstab")
          (record-adapter #/[^A-Za-z]+/)
          get-lines
          sort-group
          (sort @1 greater len)
          (take 10)
          (mapcar [juxt car len]))]
(("defaults" 5) ("a" 3) ("dev" 3) ("ext" 3) ("home" 3) ("opt" 3)
("UUID" 2) ("c" 2) ("d" 2) ("e" 2))
Pretty print the list obtained from prompt 1:
2> (mapdo (do put-line `@(car @1) -> @(cadr @1)`) *1)
defaults -> 5
a -> 3
dev -> 3
ext -> 3
home -> 3
opt -> 3
UUID -> 2
c -> 2
d -> 2
e -> 2
nil
I think it's entirely fair to say that Knuth's frame was to demonstrate one thing first and to implement something second. That he was ideating about the subject didn't - necessarily - prevent a successful implementation.
Certainly, to another set of eyes, the lower character count matters most, though.
I want to read the follow-up article where the challenge is to create a typeset document. Bentley's criticism includes a single-line shell script invoking LaTeX.
“Through this demonstration I haven't proven that Bentley didn't frame Knuth; it seems that at some point McIlroy admitted that the criticism was unfair.“
To me the question doesn't depend so much on whether Knuth was "framed".
The meaningful criticism leveled at Knuth's code was that it was monolithic. It's true that it was long because he wrote it from scratch, but that's not enough to force you to be tightly coupled.
Did Knuth try to make his code reusable? Was it reusable? I think those are the key questions.
That's not really a meaningful criticism. As others have pointed out, the point of Knuth's exercise was not to optimally solve the technical problem, but to demonstrate the effectiveness of Literate Programming (or the lack thereof). The technical problem was just a strawman, so that Knuth had a non-trivial program to demonstrate. With this in mind, McIlroy's pipe example isn't a critique of LP at all -- if anything, it was just a distracting advertisement for the Unix style of composing programs in the shell.
What McIlroy could have examined -- and chose not to at the time -- is whether awk, sed, tr, and friends could themselves be written in a literate style, and whether such a rewrite would have achieved the goals that Knuth was setting out for LP.
Knuth could have chosen to break his monolith into multiple, loosely-coupled programs, and then written them all in an LP style. But would that have really made the demonstration any more effective?
> Knuth could have chosen to break his monolith into multiple, loosely-coupled programs, and then written then all in an LP style. But would that have really made the demonstration any more effective?
I would say yes. Clearly loose-coupling isn't necessary for a program that small. And no, it isn't always optimal.
But I have clear memories of being asked to re-do an intro CS assignment three times because it wasn't in a properly object-oriented style. Modularity is not a necessity all of the time, but it is important sometimes. Demonstrating the potential to write reusable code seems just as important as demonstrating anything else. (If the conclusion is "LP helps clarity, but you can't write libraries", is that even positive?)
As far as I can tell, "this code has only one useful point of entry" was a key part of the anti-LP argument leveled by McIlroy. After all, isn't the goal to demonstrate that LP works for people who aren't Donald Knuth?
I think McIlroy missed the mark here. Single points of entry, and tightly-coupled code, might be reasonable criticisms of Knuth's personal style, but I don't see them as inherent limitations of the LP approach. You could write multiple useful, interesting narratives about a library's core elements -- algorithms, data structures, etc., -- and then write a simple appendix documenting the entry points / API. The style itself doesn't have to get in the way of good program structure.
My own critique of LP is really more about the act of writing itself. Many people, programmers included, just aren't skilled at it! Knuth's literate programs are interesting because he's got something interesting to say, and his writing style is engaging. But I wouldn't enjoy having to read (or maintain!) a literate program that was written by a poor writer in a dull, meandering style.
Also, Knuth seems to think that the literate style ought to make us into better programmers, simply because we're writing prose along with the code -- that the combination somehow unlocks a better understanding of the problem, how to solve it, and how to explain it to others. That sounds inspiring, but I'm not sure it's really true in the general case. Perhaps more research is needed to find out. :)
Did Knuth try to make his code reusable? Was it reusable? I think those are the key questions.
Not really.
Knuth focused on maintainability. McIlroy focused on reuse of existing code. These are very different, though both laudable, goals.
As an example consider the widely used program for mathematical typesetting, TeX. Knuth started work on it in 1978. Since 1988 it has only received bug fixes. Despite it being widely used, there have been no bug fixes needed since 2014. I cannot think of another program so widely used with such a good maintenance record.
However TeX itself reuses almost nothing.
No code that Doug McIlroy wrote has a maintenance record to match. But McIlroy was the original author of widely used programs including diff, echo, tr, and join. The combined works of Knuth are unlikely to have been reused in more ways by more people than diff alone.
> Did Knuth try to make his code reusable? Was it reusable? I think those are the key questions.
That wasn't the goal of the exercise. So how is that a key question?
The exercise was: Demonstrate WEB and literate programming with this particular problem. Knuth did that. The question, then, is whether the method demonstrated literate programming and WEB.
If you want to know whether the approach or tools work for creating reusable code and less monolithic code, then that question should be posed and a new exercise performed. The question shouldn't be posed to an exercise where that wasn't a concern; that's dishonest.
We have brainwashed ourselves with dogmatic theory. I see nothing wrong with the code. This code is not for business production. It works well for what it is. Un-brainwash yourself! It's great that Knuth doesn't have to apply for a job and go through the gauntlet because that would make Knuth un-Knuth!
There's a careful balance when using tools and libraries. It's obvious that they are a good choice sometimes, but I've been surprised at the number of times where a tool/library that looks like a perfect fit is actually not, and the whole problem needs to be reconsidered and I end up writing a lot of original code.
The premise was that piping together shell commands was “better engineering” than a computer program that captured the author's thoughts? Which is more robust for debugging, producing diagnostics, error handling and reporting, extending, and code reuse?
Laughable and sad at the same time, because ACM would publish that.
> In fact, one of the reasons I sometimes prefer using Perl over Python is that it's very easy to incorporate into modular Unix tool pipelines. In contrast, Python encourages the creation of monoliths of the type McIlroy criticized.
Python's Mario tool makes it easy to use Python code in pipelines.
It's my comment you're referring to (linked from the post). The full sentence was “BTW, TeX's error handling is phenomenally good IMO; the opposite of the situation with LaTeX.” You've reversed the meaning(!) but I stand by my original comment: I invite you to try plain TeX (instead of LaTeX) for a few weeks/months, and see how you feel about the way it handles errors.
Unlike LaTeX, where the (TeX) error messages usually appear arbitrary / incomprehensible / unrelated to what you're doing, in TeX (IMO) all the error messages are very informative and include a lot of information and give you ways to recover from your problem and poke around, get more context, etc. First you'll have to have read a manual (or I recommend A Beginner's Book of TeX by Seroul and Levy), but my claim is about the user experience in the steady state.
Of course, part of the reason is that LaTeX is much more complicated than the low-level things one may be doing with plain TeX. Another reason is that the LaTeX authors were working with severe constraints, one of which was of their own choosing: their choice of using TeX macros as a “programming language” (which it was never intended to be, and at which it is horrible). Nevertheless, a big part of the reason is that they were trying so hard to make things "easy" for the user in the typical case that they didn't care as much about ways in which things can go wrong and how surprising errors can be.
As someone who uses both languages extensively, I disagree.
You are right that Python is great for writing small tools that you can run, just like Perl.
But Python does not lend itself to writing them inline in a command line like was done here. Perl not only does, but has a number of useful features specifically added to fit this common use case. 3 of which were used in this example. (-a for autosplit, -M to load a module, and -e to have the code passed as an argument on the command line rather than having to have it saved to a file.)
Secondly, Perl lends itself to being used as a "better shell" while Python does not.
What I mean is that anything that can be written in bash can be trivially rewritten in Perl, and the program that you get tends to be substantially more maintainable if the bash script is at all complex. In such a rewrite there usually isn't a good reason to change the structure of the program and make it into a single Perl program.
By contrast Python has focused on the "One True Way" to do things, and the plumbing work for calling external commands is just verbose enough that a Python rewrite of a bash script is not necessarily better than the bash script. And furthermore it is much more likely that the Python rewrite of the bash script is much better rewritten as a Python script.
The result is that for someone who lives on the Unix command line, Perl integrates into their world better than Python does. If you have never lived on the Unix command line, the objections may sound silly. But spend months typing commands and doing the extra steps that Python requires Every Single Time, and it will get old.
(This is historically not surprising. Perl 1 was focused on generating text reports. Perl 2 moved into being a sysadmin tool. Perl wound up as a web language because it is what all of the sysadmins recommended for text manipulation to people writing early CGI scripts.)
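To make that concrete: printing the first whitespace-separated field of every line is a one-flag job with Perl's -a and -e, while the closest inline Python (a throwaway sketch of my own, nothing special about it) needs the plumbing spelled out:

python3 -c '
import sys
for line in sys.stdin:
    fields = line.split()   # what Perl autosplit (-a) does for free
    if fields:
        print(fields[0])
'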
Perl's big win for one-liners is braces syntax. Interestingly, there are already projects to add braces syntax[1].
Two of the three Perl features are also Python features, namely -m to run a module and -c to run code on the command line.
Regarding Perl being a better shell, there are modules like `doit` and `invoke` that make Python far better than perl for managing jobs, precisely because they make forking off jobs super easy.
But now that you mention it... I want to write a module to make python one-liners easy.
Yeah, perl borrows backticks from bash[3], so it's giving you syntax to do it directly, and it's long had strong support for opening a process using a very intuitive syntax.
Python's subprocess module works quite well, but gets extremely verbose[4] as you try to do anything more complex than "run a command and get the output" and has some nasty gotchas[2].
I forget the invoke syntax, but doit[1] is basically a make replacement, so calling the shell is pretty easy.
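For instance, a minimal dodo.py along these lines (the task and file names are invented for illustration; the task_* function returning a dict of 'actions' is doit's standard convention):

# dodo.py
def task_word_counts():
    """Build counts.txt from words.txt via a shell pipeline."""
    return {
        'actions': ['sort words.txt | uniq -c | sort -rn > counts.txt'],
        'file_dep': ['words.txt'],
        'targets': ['counts.txt'],
    }

Running `doit word_counts` then behaves much like a make target: the action is skipped while words.txt is unchanged.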
I'm an experienced Perl user, but I'm not as familiar with Python. In addition, I'm not really using Perl for sysadmin stuff, so I tend to try to keep stuff "within" Perl. As an example, I'd rather use the File::Find module than use backticks to invoke `find`. This has really nothing to do with functionality - I'm almost always on Linux, and the syntaxes are similarly hairy - it's just that usually you get more powerful functionality using the Perl functionality.
I use a few different languages, one of which is Python, and I use the command line a lot, and I agree that Python is too verbose for a lot of the things that I do on the command line. Therefore, Python is not something that I reach for when doing simple tasks involving pipelines and/or file operations.
I have not yet put time into learning Perl. In no small part because I was intimidated by the weirdness of some of the Perl code that I've seen. The terseness that Perl allows, and which I desire, is at once compelling and scary. For this, Perl has also earned the reputation of "Write Once, Read Never".
But let's assume that I overcome my fear of Perl. Which version of Perl would you recommend that I learn? Perl 5 or the language formerly known as Perl 6?
Perl5. Some version of perl5 is available by default on just about everything and can be counted on in a similar manner as counting on awk to be there for you.
I'm not going to defend Perl's readability; there are many opuses online in both directions. Suffice it to say that Perl is still a really good tool for certain $jobs.
For learning Perl, I used this: https://qntm.org/perl_en followed by some trial and error, followed by the book Modern Perl, then Higher Order Perl. glhf! Perl hacking is a blast
That's not exactly fair to Raku. A fairer critique (and keeping with the theme of this thread) is that Raku is less focused on integrating with the Unix command line than it is on tool building, putting it closer to Python than Perl(5) in the spectrum of things. This was a specific design influence dating back to the first days of Perl 6, so it makes some sense.
I know that Raku supporters disagree with me, but that has been my considered opinion for several years. And this has been something I put a lot of thought into.
Let me lay out the case.
What are the key ideas invented or promoted in Perl 6 / Raku that people get excited about?
- Object-oriented programming including generics, roles and multiple dispatch
- Functional programming primitives, lazy and eager list evaluation, junctions, autothreading and hyperoperators (vector operators)
- Parallelism, concurrency, and asynchrony including multi-core support
- Definable grammars for pattern matching and generalized string processing
- Optional and gradual typing
I got this list from https://www.raku.org/. It is what Raku people think is interesting about their own language. (So I don't get to bring up things I really don't like, like twigils.)
Some of these ideas are mainstream, some not. According to Tiobe (yes, not to be taken seriously but it is accurate enough), the top languages today are Java, C, Python, C++ and C#. Let's eliminate from the list of Raku features anything that is supported by at least 2 of them to come up with things that are novel in Raku while not being broadly adopted today. The list gets much shorter.
- Roles (OO programming)
- Junctions, autothreading and hyperoperators (functional programming).
- Definable grammars for pattern matching and generalized string processing
- Optional and gradual typing
How many of these will be widely adopted by top languages in 25 years? My best estimate is 1. Could be none, could be 2, I'd be shocked if there were 3.
I say opinion, but it is a fairly well educated opinion. Here is my argument about each.
- Roles. They have been around for some years. The only language where I have seen them used heavily is Perl 5. Nobody else seems excited.
- Junctions are mostly a bit of syntax around any/all which is pretty convenient already. Autothreading and hyperoperators are a cool sounding way to parallelize stuff, but getting good parallel performance is complex and counterintuitive. I don't think that this is a good approach.
- Definable grammars are an interesting rethinking of regexes, but parsing is a difficult and specialized problem. I don't see an interesting approach in an unpopular language changing how the world tackles it.
- Optional and gradual typing sounded great when it made it into the Common Lisp standard. But over 30 years later, only Python supports it of the top 5. And it isn't widely used there. I see nothing about the next 25 years that makes it more compelling than in the last 25. (Though Raku's implementation is far, far better than Perl 5's broken prototype system. But that is damning with faint praise.)
So use Raku if you find it fun. You'll get a view into an alternate universe of might have beens. But I still believe that the ideas that are new to you won't be particularly relevant to the future of programming.
-----
It is hard at this date to reconstruct what a similar list would have been for Perl 5 at a similar stage. People were excited about CPAN. Perl people kind of took TAP unit testing for granted and didn't appreciate exactly how important it was. Perl people liked the improvements in regular expressions but probably couldn't have guessed how influential "perl compatible regular expressions" would become across languages. Ideas we were excited about like "taint mode" went approximately nowhere. And some ideas that Perl helped popularize, like closures, were ones that few Perl programmers realized were actually supported by the language.
However it would be a true shocker if Raku was anywhere near as influential on the programming landscape 25 years from now.
> Junctions are mostly a bit of syntax around any/all
A quick look at Raku junctions makes me think they're basically a slightly tarted-up version of Icon's generators and goal-directed execution (which is no bad thing, of course but hardly novel.)
Junctions autothread. What does that mean? Using a Junction as an ordinary value will cause execution for each of the eigenstates, and result in another Junction with the result of each of the executions. An example:
# a simple sub showing its arg, returning twice the value
sub foo($bar) { say $bar; 2 * $bar }
# a simple "or" junction
my $a = 1 | 2 | 3;
say foo($a); # executes for each of eigenstates
# 1
# 2
# 3
# any(2, 4, 6)
So are you telling me that you haven't used any of those?
---
Roles combine all of those features very simply.
role Interface {
    method hello-world ( --> Str ) {...}
    # the ... means it needs to be implemented by consuming class
}
role Abstract {
    has Str $.name is required;
    # adds an accessor method of the same name
    method greet ( --> Str ) {...}
}
role Template[ Real ::Type, Type \initial-value ] {
    has Type $!value = initial-value;
    method get ( --> Type ) {
        $!value
    }
    method set ( Type \new-value ) {
        $!value = new-value
    }
}
my $value = 42 but anon role Mixin {
    method Str ( --> 'Life, the Universe and Everything' ){
    }
}
Roles were heavily influenced by Smalltalk traits. Rather than being limited to those uses, Roles were expanded to include all of those other use-cases as well.
---
Really Roles are a better method of code-reuse than inheritance.
role Animal {
    method species ( --> Str ){...}
    method produces-egg ( --> Bool ){...}
}
role Mammal does Animal {
    method produces-egg ( --> False ){
        # most mammals do not produce eggs.
    }
}
role Can-Fly {
    method flap-wings ( --> 'flap flap' ){
    }
}
class Bat does Mammal does Can-Fly {
    method species ( --> 'Bat' ){
    }
}
class Bird does Animal does Can-Fly {
    method species ( --> 'Bird' ){
    }
    method produces-egg ( --> True ) {
    }
}
class Platypus does Mammal {
    method species ( --> 'Platypus' ){
    }
    method produces-egg ( --> True ) {
        # override Mammal.produces-egg()
    }
}
Of course a simple example doesn't do this ability justice. It really shines on large code-bases.
That is of course about roles in Perl, which doesn't have all the same features. All of the points do apply to Raku roles though.
---
Raku has so many good ideas it would be a waste if other languages didn't copy at least some of them. I of course can understand if a single language doesn't want to copy all of them at the same time.
It would definitely be a waste if no other language tries to combine regular expressions, parsers, and objects like Raku grammars have done.
At the very least Raku regular expressions are easier to understand than Perl compatible regular expressions. (Note that I very much DO understand PCRE syntax, having used it heavily in Perl for many years.)
I am willing to bet you $1000 that in 2030 there are more jobs on general $job_board of your choice that mention Perl than Raku.
We can say 2040 if you think that 2030 may be too soon for Raku to have any chance at all. But there is a good chance that one of us will be dead by then. (I'll be 70, I think you'll be in your 80s.)
My point being that if you yourself are not confident enough to take some bet of that form, you cannot expect people to take you seriously when you describe Raku as "where the puck will be". Particularly not people who are happy to explain why they think that Raku won't do that, and are willing to make bets of that kind.
Bet taken. With the current rate of Perl's decline, I think that's a very safe bet to take.
Awesome. I know how hard it is to get rid of legacy code, and there are a lot more startups that I'm aware of starting with Perl today than starting with Raku.
I know I'm an old hack, but not that old: I'll be 73 for 98% of 2030.
The "then" that I was referring to in that sentence was 20 years from now. Which is 2040. As I said, you'll be in your 80s at that point.
Ruby is the best version of Perl. You can do command line one-liners and full-on object-oriented (SmallTalk object model) readable, maintainable programs.
My impression 20 years ago was that Ruby is an interesting mix between Perl and Python. In principle there is little that differentiates Perl and Ruby in terms of how maintainable or not their code can be.
However the Ruby ecosystem wound up with a lot of modules contributed by people who had just moved to it from languages like Java. They overreacted to their newfound freedom. The result is that between a poor testing story and questionable practices like "monkeypatching" (literally modules overwriting random methods in other modules) the Ruby ecosystem wound up with a lot of nasty gotchas. (There is a lot of "if you load module A then B, it works, but load module B then A and it doesn't.")
Yes, Ruby programmers get up in arms when you say that they have a poor testing story. But ask them whether by default they have actually run unit tests for everything installed on their system, and they have not. Ask them if they could run unit tests and they think they can. But those who I have watched try have found out the hard way how many unit tests were only written for the original author to run in the original environment, and can't easily be run in an automated way. By contrast the default for CPAN is that every module has had its unit tests run on every system it is installed in, and automated smoke tests ensure that modules have had their tests run on a wide variety of operating systems and versions of Perl.
The result is that random Ruby module X is generally less likely to be dependable than random Perl module Y. Which in turn means that in my experience significant Ruby code bases written by competent programmers top out at smaller than Perl, with worse maintenance stories.
That doesn't discount the fact that there has been a tremendous amount of unmaintainable Perl written by incompetent programmers. (Particularly during the wild dot com days.) But "maintainable" is NOT something that Ruby has a good story to tell about.
Ruby is about equal to Perl as a language for interactive command line usage, and both are better than Python.
Comparing RubyGems to CPAN, CPAN is about 2x as large, has a better infrastructure, better testing, and is generally better.
Comparing CPAN to PyPI (the Python version), they are about the same size; PyPI has a worse testing story, has more up-to-date modules, is growing faster and seems to be of similar quality. If you want to write a system that integrates with a recent standard, want support from Google, or want to use something like machine learning, Python is the clear winner.
Adding JavaScript, I consider node.js to have the worst command line story, worst repository system, but it is extremely popular.
I personally use Perl for command line stuff and Python otherwise. I use JavaScript when I have to (and sadly I have to a lot). It is rare for me to bother with Ruby. But I learned Perl first, and have written more in Perl than the others combined.