Aaron Swartz's Thoughts on djb (2009) (aaronsw.com)
271 points by cookrn on March 6, 2015 | 137 comments



I've spent significant time reading DJB's source, particularly daemontools. It makes me sad that programmers in this decade don't seem to be taking that much influence from him.

Commercial software is hopeless in terms of bloat and security holes. Open source software is sadly not that much better.

The whole motivation for djbdns was all the glaring holes in BIND, and I think the same is true for qmail and sendmail. Yet basically every institution in the world is running a pile of sloppy software that dumps our private data to hackers on command.

There is a lot to learn from djb. He's about 10-20 years ahead of his time, and you have to read code to absorb the wisdom.

Here is some text that may help:

http://thedjbway.b0llix.net/

http://lwn.net/Articles/257004/


For anyone else who didn't know, he wrote this Daemon Tools:

http://en.wikipedia.org/wiki/Daemontools

not this Daemon Tools:

http://www.daemon-tools.cc/


> It makes me sad that programmers in this decade don't seem to be taking that much influence from him.

For a while, his stuff had weird licensing:

http://en.wikipedia.org/wiki/Qmail#Copyright_status


Yeah, it was confusing enough that qmail was dropped from the OpenBSD ports tree.


The Debian packages were scripts that downloaded, compiled, repackaged, and installed qmail for you, which was a giant mess but the only real way to make it work (license-wise).


DJB is contributing to a security research OS called Ethos, https://www.ethos-os.org/

Papers: https://www.ethos-os.org/papers.html


I found his postmortem of qmail interesting reading: http://cr.yp.to/qmail/qmailsec-20071101.pdf


Interesting to read that djb is in favor of automatic exception handling rather than explicitly checking return codes:

"Fortunately, programming languages can—and in some cases do—offer more powerful exception-handling facilities, aborting clearly defined subprograms and in some cases automatically handling error reports. In those languages I would be able to write

    stralloc_cats(&dtline,"\n")
or simply

    dtline += "\n"
without going to extra effort to check for errors. The reduced code volume would eliminate bugs; for example, the bug “if ipme_init() returned -1, qmail-remote would continue” (fixed in qmail 0.92) would not have had a chance to occur."

Contrast this with the explicit error handling promoted by Go (and used in djb's software due to lack of exception handling in C).
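
To make the contrast concrete, here is a small self-contained sketch (my own illustration, not qmail code) of what the explicit, return-code style forces you to write in C: every fallible step gets its own check, which is exactly the code volume djb says better exception handling would eliminate.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Illustration only (not qmail code): build up a line by appending
       pieces, checking every allocation and write by hand, the way C
       without exceptions requires. */
    static void die_nomem(void)
    {
        fputs("fatal: out of memory\n", stderr);
        exit(111);                    /* 111 is qmail's "temporary failure" code */
    }

    int main(void)
    {
        const char *parts[] = { "Delivered-To: someone@example.com", "\n" };
        char *line = malloc(1);
        size_t len = 0;

        if (!line) die_nomem();       /* check #1 */
        line[0] = '\0';

        for (size_t i = 0; i < sizeof parts / sizeof parts[0]; i++) {
            size_t n = strlen(parts[i]);
            char *p = realloc(line, len + n + 1);
            if (!p) die_nomem();      /* check #2: the check that
                                         dtline += "\n" would hide */
            line = p;
            memcpy(line + len, parts[i], n + 1);
            len += n;
        }

        if (fputs(line, stdout) == EOF) {  /* check #3: even the final write */
            free(line);
            return 111;
        }
        free(line);
        return 0;
    }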


That's the philosophy in Python. Instead of checking if you can do something ahead of time, just do it. If it fails, handle it then.

This results in significantly cleaner code, since you generally assume that the code you're writing is going to run correctly (e.g. that you can successfully create a directory in /tmp/ or a pidfile in /var/run).

Python, however, is also dynamically typed, which means that a lot of code is going to run even if you get unexpected input (e.g. 'print "this variable is %s" % somevar' is going to do a reasonable thing for pretty much every possible value of somevar), but it also means that your code might do something with invalid input which only makes sense if you know that your input was invalid (e.g. 'y * 10' is 20 if y is 2, or "2222222222" if y is "2"). C doesn't have that flexibility/curse, so it wasn't really an option for DJB.


"Well defined subprograms" is key here -- in a mutable language like Go, goroutines are not well-defined subprograms, so exceptions are tricky to use correctly.

Edit: I should clarify that's just speculation/personal opinion on my part. I'm not sure what djb had in mind.


We're probably in some kind of security dark age right now where our tools just aren't good enough to automatically fix things and from a political/business/market pov, we don't have the incentives to write truly secure software.

I imagine things like Rust are going to help in the near future. Just handling all these buffer overflows is going to make a fairly big difference in computer security. I also imagine things like IPS's are going to trickle down to client devices like laptops and phones. We have enough extra CPU power to check every packet for known exploits and pro-actively stop things like XSS or SQL injections.

We are also probably looking at the dawn of the locked down desktop that can only access installable software from the OS store which will deliver only signed and vetted software. Our largest exploit right now is just being able to send someone a hyperlink to trojan.exe and having them run it. End users will always fall for that unless the OS stops them.

A higher level of app security and app behavior nannying is probably coming our way. If you have a Windows machine you can install EMET right now. There are rumors EMET's functionality will be baked into Win10.

This period reminds me of the pre-Nader period in American automotive history. Manufacturers competed on HP, large sizes, etc. and not on safety, efficiency, etc. We're asking for the wrong things. Now that things like low-power chipsets, SSDs, and multi-cores are the norm, it's time for the industry to focus on security. It'll get there -- dragged, kicking and screaming, eventually. Especially in the age of bitcoin, which makes Cryptolocker ransomware payments anonymous. More code reviews and more forks and clones aren't going to win this. There needs to be a holistic change in how we create and deploy software, and this change needs to be in demand. The real question is: are we incentivized to make this change yet? Why aren't we asking for better security?


I can't find them again, but I remember seeing design diagrams from djb's qmail. His approach to system design is pretty great IMHO, it's pragmatic, minimalist, language/paradigm agnostic.


Looking at his code, it's kind of scary. No comments. K&R C style declarations. Vast amounts of pointer arithmetic. What makes this work is that he defines a generic collection class.[1] This being C, it's a macro which generates a struct:

    #define GEN_ALLOC_typedef(ta,type,field,len,a) \
     typedef struct ta { type *field; unsigned int len;\
     unsigned int a; } ta;
This is equivalent to <vector> from C++. Then he defines strings based on this (see stralloc.h), and provides the usual operations on them. Disciplined use of those primitives provides good reliability for all the string handling a mail handler does.

[1] https://github.com/amery/qmail/blob/master/gen_alloc.h
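
For reference, stralloc.h instantiates that macro for characters; reconstructed from memory (so treat this as approximate rather than a verbatim excerpt), the instantiation and its expansion look roughly like this:

    /* approximately what stralloc.h does with the macro: */
    GEN_ALLOC_typedef(stralloc,char,s,len,a)

    /* ...which expands to: */
    typedef struct stralloc {
        char *s;            /* the bytes; not NUL-terminated by default */
        unsigned int len;   /* number of bytes currently in use */
        unsigned int a;     /* number of bytes allocated */
    } stralloc;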


> Looking at his code, it's kind of scary. No comments.

Did you check out addresses.5, for example? The source may not be commented, but if he clearly describes his assumptions (he does), source comments are less necessary.

> K&R C style declarations.

I don't understand what bearing this has on code quality. Do you contend that it leads to bugs?


And the code had to be compiled with compilers that even in the nineties were "old".


I don't think you are right to say "had to be compiled with". It could be compiled with SunOS cc and the C compilers of other vendors that were crappy and out-of-date. It could also be compiled with the current gcc/egcs etc. that were out. I think it had issues with e.g. the Solaris cc because that was a pretty broken and barely used compiler to begin with.


Just glancing through the qmail GitHub repository gave me a headache. Plenty of files have 0 comments (not really sure where you see the clearly stated assumptions). In addition, he seems obsessed with 1-letter variables, which offer no indication of their purpose at a glance.


Here's something I did for myself a long time ago: take some of the fundamental code like alloc.c, alloc_re.c, fmt_str.c and str_*.c, and start commenting it yourself. I found it so expressive and concise that after a short time the code was very clear and the comments got in the way.

The only real noise is that no-one uses SunOS cc anymore, and compilers will optimize loops (expanding in place, or doing other magic optimizations when appropriate), so loop-unrolling apparently doesn't buy an increase in performance anymore.
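
As a concrete example of both points, here is roughly what qmail's str_len looks like, reconstructed from memory with comments added and the K&R-style declaration modernized (so treat it as approximate, not a verbatim excerpt); the four-way manual unrolling is exactly the kind of thing a modern compiler will do for you:

    /* approximately qmail's str_len.c, with comments added
       (the original has none) */
    unsigned int str_len(const char *s)
    {
      const char *t = s;

      /* manually unrolled four ways: scan for the terminating NUL,
         checking one byte and advancing per statement */
      for (;;) {
        if (!*t) return t - s; ++t;
        if (!*t) return t - s; ++t;
        if (!*t) return t - s; ++t;
        if (!*t) return t - s; ++t;
      }
    }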


And yet he probably wrote code that would be considered objectively better than anything I have ever seen from my colleagues who favor long variable names and lots of comments. Perhaps there is more to it than surface-level concerns?


Have you ever worked in a large C code base? None of what you describe is shocking or bad.

This is not a defense of the code as I have not had a chance to look at it. It seems like Swartz was a bit over the top in his description of its greatness and I am naturally skeptical of such idolatry.


So basically, Pascal strings? :)


No, basically, generics, like the STL in C++, but in plain C using just macros.


I disagree with Aaron. I would reserve the title of greatest programmer in the world for Fabrice Bellard. He single-handedly wrote QEMU, FFMPEG, an LTE base station, a PC emulator in javascript, and countless other projects. Alone.

http://bellard.org/


djb aims to write bug-free programs and actively works on improving the process to do so. While Bellard is a great programmer, I don't think Bellard aims to write bug-free programs.

See section 8 of http://cr.yp.to/cv/activities-20050107.pdf

There are more people who aim to write bug-free programs, but most of them seem to use formal verification. djb seems pretty unique in aiming for bug-free programs in normal programming and getting close.

Re: Bellard software quality: fuzzing found 1120 bugs in FFmpeg. While Bellard's code is only a small part of the entire FFmpeg codebase, FFmpeg is very far from being bug-free.

http://googleonlinesecurity.blogspot.com/2014/01/ffmpeg-and-...


I was shocked (and impressed) when I looked at some of djb's code and saw that he put his "the return value of every syscall must be checked" view into practice so consistently that every single printf() (not only fprintf()!) was inside an if block that checked whether it succeeded in writing to stdout or not.

Of course, printf() does have a return value ("Upon successful return, these functions return the number of characters printed"), and it can fail... but it takes a considerable consistency of purpose to decide to deal with that possibility explicitly every single time.
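
A minimal sketch of the pattern being described (my illustration, not djb's code) looks something like this:

    #include <stdio.h>
    #include <stdlib.h>

    /* Treat a failed write to stdout as a fatal error rather than
       silently ignoring it. */
    int main(void)
    {
        if (printf("hello, world\n") < 0) {
            perror("unable to write output");
            exit(111);   /* qmail convention: 111 = temporary failure */
        }
        return 0;
    }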


I think this says as much about djb's abilities as about C's limitations.


out of curiosity, what do you (does he) do if printf fails?


From qmail/qmail-pw2u.c (linked somewhere above):

    void die_write()
    {
      substdio_putsflush(subfderr,"qmail-pw2u: fatal: unable to write output\n");
      _exit(111);
    }

    /* ... */
    if (substdio_puts(subfdout,uugh) == -1) die_write();
    if (substdio_puts(subfdout,dashcolon) == -1) die_write();
    if (substdio_put(subfdout,x,i) == -1) die_write();
    /* ... */


So what does he do when substdio_putsflush has an error?


Well, the process will still exit unconditionally immediately afterwards, attempting to signal an error via its return code. It's hard to do better than that in the context of a real operating system.


> See section 8 of http://cr.yp.to/cv/activities-20050107.pdf

Last sentence in the paper:

"I won't be satisfied until I've put the entire security industry out of work."

That's nothing short of amazing.


Another one from Section 8

> Bug-elimination research, like other user-interface research, is highly nonmathematical. The goal is to have users, in this case programmers, make as few mistakes as possible in achieving their desired effects. We don’t have any way to model this—to model human psychology—except by experiment. We can’t even recognize mistakes without a human’s help. (If you can write a program to recognize a class of mistakes, great—we’ll incorporate your program into the user interface, eliminating those mistakes—but we still won’t be able to recognize the remaining mistakes.) I’ve seen many mathematicians bothered by this lack of formalization; they ask nonsensical questions like “How can you prove that you don’t have any bugs?” So I sneak out of the department, take off my mathematician’s hat, and continue making progress towards the goal.


It's really quite different. Bellard is inspiration, he creates paths to new and interesting things. DJB's avowed goal is to put the security industry out of work by not having bugs through rigorous practices.

Bellard is what you use on your laptop to have fun, DJB is what you run in your bank to make the world go round.


While DJB may be more "bank" oriented, plenty of people have been making the world go round using Qemu and FFMPeg. I'd argue that Youtube (if I recall correctly it was, and possibly still is, one of the major users of FFMPeg) has had just as big of an impact as online banking.


Well, FFMPeg makes the world go round, including for hackers: http://thenextweb.com/google/2014/01/10/google-says-helped-f...


I admire Fabrice Bellard. I find the amount of software that he has produced amazing. Some of his creations are remarkable for being one-man efforts, e.g. a compiler, an emacs clone, an emulator, etc.

I spent quite some months learning about editors, and one of my evening distractions was to write a new UI backend for qemacs. I found the design clever, quite flexible, etc., but in no way easy to read or elegant. (To be fair, it did not help that he added a video player and a WYSIWYG HTML editor to it.)

Later I came across a Vim clone called vis (https://github.com/martanne/vis) and found its source code much closer to those programs where you think "It can't be made simpler".

I wish future programmers read more code before writing, just as literature writers do. It should be part of the education system, and I regret not having read the "classics" before I started writing.


I always had the feeling that djb's software was going to be interesting but never widely adopted long term. I think this feeling turned out to be right, since I rarely come across environments running anything from djb, especially lately, while it's hard to spend a day without touching a few things involving code from Bellard.


I've spent quite a bit of time trying to do things with tcc, and - ugh, no.


tcc did start life as an Obfuscated C Contest entrant. It shows.

(It won, of course. http://bellard.org/otcc/)


Not single-handedly; look at how many contributors most of those projects have...

Fabrice Bellard is smart, though his code is hard to read imo.


Wow three letters, so many stories in my head.

Here are my top reasons why I think qmail + daemontools + djbdns is great:

    1. bug-free code (well, almost);
    2. fast & memory efficient;
    3. follows the "everything is a file" philosophy;
    4. easy to configure; and more.
I've been using this suite for almost 10 years now. I've never found any alternative. BIND was too complicated in terms of configuration for me and also got updates almost every 2 weeks (8 years back that was much, much harder than doing "apt-get ...").

Qmail is also a flawless piece of software. It basically taught me how you should separate processes and how to use Linux accounts properly for daemons. Amazing.

Finally the only thing I rely on right now is daemontools. I'm using it for all of my production nodejs websites and even for a tool that I call "deployer" which automatically stops, updates and starts a daemon from the suite.


Sorry, 8 years ago it was 2007 and updating BIND was done precisely by doing 'apt-get ...'. I still have a server running the same installation of Debian since 2005 (copied from one HDD to another and then to a virtual server), and it's been apt-get updated all the way through.


I know you're having fun showing off how easy it was to update BIND, but how many times have you had to update it since that time? Now how many times did he have to update the DJB name server?


I don't know, I'm not counting them. And there is no doubt djbdns is better than BIND regarding security - it's just you can't use "8 years ago updating stuff was hard" as an argument, because it was not.


8 years ago I was using Slackware as my main server distribution. Yes, there is no "apt-get ..." in Slackware even nowadays, but back then Debian had only switched to aptitude around 2005 (sarge) [1].

I remember reaching an uptime of 3 years on one of the machines around 2010, with my good old patched Slackware 10. Now imagine me switching to "apt-get ..." for the sake of "easy security updates" for a tool that had existed for less time than my uptime.

Yes, this is a case where I had to use exactly that argument, but against doing precisely that.

1: https://www.debian.org/doc/manuals/project-history/ch-detail...


Qmail is far from flawless. It could be better if patches were allowed, but no, that's forbidden.

As much as DJB is a talented programmer, his attitude towards outside contributions comes across as extremely hostile.

I can't take any of the DJB projects seriously because they're always in a state of permanent abandonment and maintainers are impossible to find. They serve as art pieces more than practical, useful software.

He's certainly from the old school of development where tight control over your software ensures it's not exposed to risk from external developers, and of course he's entitled to do that. The consequence is that it severely limits how this software can be used.


Why should software be in a state of permanent development?

"I can't take any of the DJB projects seriously because they're always in a state of permanent abandonment and maintainers are impossible to find. They serve as art pieces more than practical, useful software."

There are plenty of "maintainers"; just search for the qmail mailing list. If that is impossible then I'm not sure what to say to you, but what you say is bunk.


Software needs to be in a state of permanent development because nothing is static. Operating systems change. Usage patterns change.

Seeing Qmail rot on the vine is heart-breaking, but that's exactly what happened.

I used to use Qmail exclusively, but over time it got harder and harder to install an up-to-date version of it, requiring more and more "unofficial" patches to get it actually working. After too much of that I gave up and went with Postfix.

I don't want "mailing lists", I don't want Bugzilla, I don't want SourceForge. I want people that respond to something like GitHub issues where the interaction is painless and productive.


qmail has been in the public domain since 2007, and patches have always been allowed.

Most of your other statements seem similarly at odds with reality to me, but they’re more statements of your taste than anything else.


It's neither fast/memory efficient nor easy to configure. Back when we created our email infrastructure (10yrs ago) we deeply evaluated qmail and postfix. Postfix beats qmail on both counts by a huge margin.

The rest of the qualities do stand. We kept tinydns from that evaluation, and gladly so. It's a marvelous piece of software, never breaks on its own, requires near zero maintenance.


I largely agree with this, but I also think it's worth noting that DJB opted out of writing most functionality, leaving it to the end user, so he has provided us with a fantastic bike frame onto which we bolt software far inferior to the alternatives.

You want AXFR with djbdns? Well, DJB decided that AXFR is stupid and that you should live in a monoculture of only DJB software, which doesn't have to conform to standards. So you have to write scripts to handle this at both ends, and AXFR is one of the BIGGEST security concerns in DNS.

That said, I've really enjoyed running qmail, dnscache, and daemontools. These days I use runit, simply because it is maintained, and because I have trouble buying into the notion that any software can be suitable across platforms and changing underlying libraries. I have no doubt that runit's code is less stringent than DJB's, and I find it frustrating that a couple of things I used to do with daemontools cannot be done with runit.

Anyway, it's always good to resurrect Aaron's ideas; that DJB outlasted him is a fucking shame.


This is definitely true. qmail didn't have bugs in part because it implemented a barebones SMTP server, and had a lot of ridiculously onerous conditions under which it ran. For example, files in the queue were named after their inodes, meaning you couldn't just restore from backup and have it work. UID/GID were compiled in statically so you couldn't pre-build binaries and distribute them unless you had centralized user management via NIS/LDAP.

There were no features and no way to extend functionality, meaning that as the internet's use of email changed (e.g. SPF records), everything was distributed as a patch to qmail. This meant 1. sysadmins got to spend more and more time applying patches that hadn't been tested together, struggling to even get them to apply or compile; 2. sysadmins got to debug a lot of other people's code because there wasn't anyone to report bugs to; 3. if you accidentally blew away your qmail build directory there wasn't much chance of you getting an identical configuration back again, making your entire mail system ridiculously fragile.

When qmail was first released, it was awesome and amazing. Only a few years later, Postfix started providing the vast majority of the security benefits of qmail (e.g. separating privileges and functionality into separate daemons), with nowhere near the number of headaches. Need to change a configuration parameter? Just use postconf(1) instead of recompiling. Need to replicate your mail configuration? Just copy the configs over. Need to add new functionality? Milters.

I worked at a hosting company a few years ago that still used qmail (and Apache 1.3, and this was in 2010); there was a slight misbehaviour in qmail which we needed to change, which resulted in one of our sysadmins (who didn't really know C) spending days reading, changing, compiling, testing, and debugging code which, with Postfix, would have been a one-line config change. And who knows if it's robust? He stopped working on it once he had a solution that passed his test without segfaulting immediately.

Qmail did wonders for the internet by replacing sendmail, but horrors for the internet by replacing necessary functionality with onerous security, requiring third-party patches for almost everything other than just exchanging mail between servers, and refusing to update the code to add anything that wasn't there already.


"I worked at a hosting company a few years ago that still used qmail (and Apache 1.3, and this was in 2010); there was a slight misbehaviour in qmail which we needed to change, which resulted in one of our sysadmins (who didn't really know C) spending days reading, changing, compiling, testing, and debugging code which, with Postfix, would have been a one-line config change. And who knows if it's robust? He stopped working on it once he had a solution that passed his test without segfaulting immediately."

Anecdotal, please provide a link to the qmail mailing list with evidence.


I have vague memories that qmail was mostly used with patches that DJB didn't want to incorporate (e.g. support for STARTTLS).

This ended up causing a lot of people to use other servers which _did_ respond to user demands.

So maybe other than only learning from the way he writes code, we can also get some ideas on how to nurture open source projects.


DJB wasn't exactly wrong about TLS; it's been an unending source of security holes for a while now, and everybody that's looking at the code expects more. Most of his work is actually dealing with encryption and pointing out flaws in implementations like DNSSEC (which provides no security to the end client).


There's also the licensing, preventing distros from using their own directory layouts. djb says they're being ridiculous and his folder layout is superior (and it is). But it seems to have the opposite effect if it just means packagers decide to simply drop the software :(


The software you're talking about is in the Public Domain.


Only since December 2007:

http://en.wikipedia.org/wiki/Djbdns#Copyright_status

http://en.wikipedia.org/wiki/Qmail#Copyright_status

The change was made because the situation was unfolding exactly as described above.


I've followed a policy of using djb's software whenever possible over the years. Unfortunately qmail shows its age and may no longer be a good choice these days, but then again, who runs their own email server? :)

daemontools and ucspi-tcp are still some of the best tools to dig into. I love multilog (part of daemontools), which solves problems I didn't even know existed. E.g. I had a pool of servers and wanted to collect all the logs in one place... the way multilog names files makes this a simple rsync task, and you can just concatenate and sort; it even works for multi-line output.

Since I started using docker, djb's tools got a new life for me. I manage all services within docker containers with daemontools.


Qmail may show its age, but the overall structure is sound, and one of the beautiful parts of qmail was/is how it isolated everything into separate processes intercommunicating via very simple protocols over file descriptors. So you could build far more complicated mail systems by starting with qmail and replacing parts as you went.

A company I co-founded used qmail for delivery for a webmail service for ~2m users, and while we used the original qmail less and less, it was because we were able to use qmail as scaffolding as described above.

And we ran lots of services under daemontools.

Later we used qmail as a generic message queueing system for a turnkey registrar platform for .name...


Are you able to expand on how qmail was used / not used? I didn't think it was something that could be easily modified like that


In which context?

For the webmail provider, Qmail was used for all inbound/outgoing e-mail. One modification we made was to replace the delivery process with one that looked up the mail storage backend a given user was on and passed the mail on to qmail's normal delivery process.

We also modified the delivery process to embed additional information in the message file names so that we could get away with just reading the directory to get file size (instead of additional stat() calls for each file), flag status etc.

Eventually we added quota checking and a cache of parsed header information (coupled with a custom command added to the POP server to list that info).

Qmail was ideal for that given that it consists of a bunch of small, easily understood components that all are documented extremely well and can be tested on the command line.

For the queueing system, any mail system works fine if your requirements do not include absolute ordering. All we needed to do was poll a Maildir (or POP server) to handle incoming messages, or inject outbound e-mail for outbound messages. It was fast and simple at a time when there was a distinct lack of open source dedicated queueing implementations.


> Unfortunately qmail shows its age and may no longer be a good choice these days

Last I checked, Yahoo (Mail) uses qmail. Can anyone verify whether that is still true today?


Yahoo advertises TLS support, so the bulk of the code Yahoo uses will be third-party extensions, perhaps openssl.


>who runs their own email server? :)

Anyone who cares about privacy? :)

Qmail is too old to be usable these days, but Postfix does at least use its security model. It isn't perfect, but arguably the best mail server around for that reason.


> who runs their own email server? :)

Hillary Clinton


"Qmail is too old to be useable..."

With regard to age, both qmail and Postfix were written around 1998.

For my purposes, qmail works just fine. I use it; therefore it is "useable".

For me, Postfix is overkill. Not fact, just my opinion. Personal preference.

In the event I want to change something, I find it easier to modify and recompile qmail than I do to modify and recompile Postfix.


> With regard to age, both qmail and Postfix were written around 1998.

Postfix has been updated (latest stable release February 8, 2015). Qmail doesn't get updated because DJB considers it complete (last [official] stable release June 15, 1998). There are custom third-party patches you can compile into qmail, so I guess those could be considered updates.


>Is this your opinion, or fact?

Well, it depends on the feature set you need, but to a level that most people might consider fact.

From memory, no support for TLS connections, SMTP AUTH, or SPF/DKIM on mail. That, plus it has very limited support in general and needs an arcane collection of patches to remain workable.

Other mail servers can work around its restrictions, yes, but that doesn't make it fit for purpose.


I used to work as a UNIX admin, so having a home OpenBSD box to handle mail seemed like a requirement. I ran qmail and djbdns, under daemontools, because there was nothing else as secure, nothing else 'properly' designed (I agreed with the design principles behind the software) and nothing else as easy to administer (I really don't like m4).

I spent a long time trying to understand why DJB's software wasn't considered the gold standard and installed by default on all Linuxes and BSDs. I read lots about arrogance, but couldn't see any - I could only see a commitment to solid software.

Eventually the only theory I was left with was that perhaps the 'UNIX way' was something people didn't really understand, or didn't want to invest time into understanding. I'll draw a parallel with vi editor[s]: Those who invest time into understanding the vi philosophy are happy working with it and would rather not use anything else. Others think they (we - you might have guessed I'm a vi user) are somehow ultra geeky or hardcore. Maybe this is why there's a preference for server software with a shorter learning curve, too.

The thing I really didn't 'get' was that to me, djb's tools had no learning curve, because they work 'the UNIX way'. Perhaps this is why those who like[d] his software never made their voice heard enough, or did the work, to get them into mainstream distributions: They don't understand why others don't see that they're great.

There could be other reasons. For example: I gave up running my own mail server when I got sick of dealing with spam and couldn't find a decent web interface. I think squirrelmail was the best I could find at the time and gmail was so much better. Sorry, squirrelmail! Perhaps it was easier to integrate anti-spam software with other mail servers. Perhaps I never found out because I refused to use other mail servers, having seen all the security advisories. Maybe my refusal to consider 'insecure' software meant I was blind to the advantages of sendmail and postfix, rather than simply others being blind to the advantages of qmail.


> I spent a long time trying to understand why DJB's software wasn't considered the gold standard and installed by default on all Linuxes and BSDs.

His license had a "no modifications" clause, for one thing.


That and the fact they decidedly don't work "the UNIX way".


Simple processes that perform a well-defined job. Programs which can be easily modified by people other than the developer. Composable pieces. http://en.wikipedia.org/wiki/Unix_philosophy.

What is the unix way from your perspective?


Perhaps it was "too unix". Too simple processes performing too narrowly specified jobs. It made the whole apparatus difficult to put into operation because of all the moving parts. It'd be like a precision watch that keeps great time, but if somebody just handed you a box of 300 springs and gears, you'd probably prefer a quartz crystal.


By that description, no popular software was working 'the UNIX way' in the '90s and '00s. I avoided dealing with the abominations of Sendmail and Named for that reason.


I should have also stated that it's not necessarily a bad thing, it just makes certain things (e.g. creating packages) more cumbersome/involved.


Even when it was popular, you had to compile qmail from source after applying third-party patches to provide necessary features. For sites that wanted to leverage a distribution's package manager for seamless integration and security updates, this was a huge barrier to entry.


I switched to qmail when I had an Exchange server (early 2000s) with a configuration problem I spent days trying to figure out, which finally led me to emailing MS Support.

Their suggestion was to just back everything up and reinstall. Sometimes that would fix it.

I spent the next 2 days meticulously setting up my RHL/Qmail server and never had an issue. Ran like a champ until that startup died.


Are you people serious? His code is absolutely disgusting:

    uugh = constmap(&mapuser,x,i);
    if (!uugh) die_user(x,i);
    ++i; x += i; xlen -= i; i = byte_chr(x,xlen,':'); if (i == xlen) return;
https://github.com/amery/qmail/blob/aa6bf9739209ca76f7f3af0f...

It's mostly bug-free because it scares bugs away. If I was a bug I wouldn't want to live in there.


What are your issues with it? That the variable names are not descriptive enough to be understood when the code is taken out of context? That the if-substatements aren't compound statements? The last line? What would you have done instead (back then)? Putting every statement on a new line makes it more difficult to understand, since the statements belong together; a macro obscures the code and also has to be named; and a function just for this is (was) too expensive speed- and memory-wise.


Not the GP, but I agree with him. I dislike everything you listed, but think that even with context it's unreadable. Context is hard to find too, since this file has only 3 lines of comments (for an enum equivalent). Also, I don't buy the "function too expensive" idea - that looks like a crazy micro-optimization. If it was needed, I'd say it deserves a comment.


Disliking omitted braces or really short names is just a matter of taste (both are classic C coding style), but you can easily find out what they store without comments or a longer name. I personally would have chosen l instead of x, which is a pointer to a character in the current read _l_ine (see L202).

Can you tell me what comments you're expecting? qmail comes with a bunch of man pages (including ones for functions and they're well written). The file is called qmail-pw2u.c and there's qmail-pw2u(8). The function is called dosubuser, so let's look up subuser in it:

> Extra addresses. Each line has the form

> sub:user:pre:

That gives us a pretty good idea about the purpose of the posted code (plus the previous lines).

I agree, a function or macro makes sense (since there are many other equivalent constructs in the same file), but that's "just" giving a bunch of statements a name. Does field_next explain what they do?

    static int field_next(int *i, char **x, unsigned *xlen) {
        ++*i; *x += *i; *xlen -= *i; *i = byte_chr(*x,*xlen,':');
        return *i == *xlen;
    }
    /* later */
    if (field_next(&i, &x, &xlen)) return;
The rest of the function is writing something. Maybe we should find out the purpose of the program first? The man page tells us this too:

> qmail-pw2u reads a V7-format passwd file from standard input and prints a qmail-users-format assignment file.


I can't tell the extent of sarcasm in the comment, but the final sentence made me cackle: "If I was a bug I wouldn't want to live in there"! Ha!


It's extremely terse.

It's a question of priorities; it seems like djb prioritizes correctness over readability - which is fine, but does limit the size of his audience.


An eloquent, objective, and thoroughly researched opinion.


DJB had some great comments on what he thought makes qmail secure: http://cr.yp.to/qmail/qmailsec-20071101.pdf

Perhaps the most legit complaint of DJB's work is that he would often lobotomize chunks of a protocol if he didn't like it. But it was still great work and it contrasted nicely to some horribly insecure software at that time (sendmail and bind)


That's an amazing paper, thank you for linking to it! I actually learned something new from it, in particular the sections 5.1 "Accurately measuring the TCB" and 5.2 "Isolating single-source transformations". It turns out there's a wrong way and a right way to do "privilege minimization" for security, and all my life I've been thinking about it the wrong way.


It also shows that the resulting "bug-minimal" code didn't just spring out of nothing but is the result of a lot of experience even two decades ago:

"I started writing an MTA, qmail, in 1995, because I was sick of the security holes in Eric Allman’s “Sendmail” soft- ware."

djb even then analysed the security aspects of the bugs, and spent considerable time working on the solutions:

"My views of security have become increasingly ruthless over the years. I see a huge amount of money and effort being invested in security, and I have become convinced that most of that money and effort is being wasted. Most “security” efforts are designed to stop yesterday’s attacks but fail completely to stop tomorrow’s attacks and are of no use in building invulnerable software. These efforts are a distraction from work that does have long-term value."

BTW the "TCB" was never explained in the article but I guess he means "trusted computing base."


He was also the person behind Bernstein v. United States, in which the Ninth Circuit ruled that software source code is protected free speech.

http://en.wikipedia.org/wiki/Bernstein_v._United_States


> Bernstein was originally represented by the Electronic Frontier Foundation, but he later represented himself despite having no formal training as a lawyer.


Yes he did. He's sort of a badass like that.


DJB is a software engineer who doesn't listen to others' input and believes that his way is always the right way. Despite the fact that his software is solid, I try to avoid it at all costs because it is a huge pain to configure and manage and doesn't follow the UNIX way at all. It follows the DJB unix way, which is terrible if you want to use anything that isn't DJB.

Also tinydns is fundamentally broken because DJB doesn't believe in split views.


If anything, it's a huge pain to configure and manage because his software follows the Unix way more faithfully than most Unix server software: collections of small programs that communicate over well defined interfaces.

As a programmer, it's great. The very minimal feature set and lack of high level abstractions makes the code very easy to understand and modify. The fact that everything uses stdio makes debugging really easy.

As a sysadmin it sucks because providing the features your users rightly expect forces you to maintain patches as well as configuration. And a good deal of that configuration is long pipelines where tiny changes have huge consequences. Set the magic option in the magic variable to change behavior. Certain files that aren't explicitly mentioned anywhere just have to exist, and one wrong permission or file ownership just breaks everything and it's really annoying to figure out what's wrong.


> his software follows the Unix way more faithfully than most Unix server software: collections of small programs that communicate over well defined interfaces.

In my worldview, that's only part of the unix way. The other part is putting the files in well known and agreed upon locations, like /usr and /usr/local and /etc. DJB software breaks these conventions.

Everything else you said is spot on.


The people who invented Unix disagree with you, or they would have put those files there when they wrote Plan 9.


Are you aware of the client location (%lo) support introduced in 1.04? Since it works at a per-record level---instead of per-zone---I find it more useful than the typical split view support.

http://cr.yp.to/djbdns/tinydns-data.html
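
For anyone who hasn't seen it, an illustrative data file (based on my reading of the tinydns-data docs; the location codes and addresses here are made up) looks something like this: a % line assigns clients to a location by IP prefix, and the trailing field on a record restricts that record to one location.

    # clients whose IP starts with 192.168 are in location "in";
    # a % line with an empty prefix matches everyone else
    %in:192.168
    %ex:
    # same name, different answers depending on the client's location
    # (+fqdn:ip:ttl:timestamp:lo with ttl and timestamp left empty)
    +www.example.com:192.168.1.10:::in
    +www.example.com:203.0.113.10:::ex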


I wasn't aware of that, and you're right, that is probably more useful than split view. So I stand corrected on that one point, but dang if it didn't take him a long time to come around.


> But these programs are not just for being seen or read — like a graceful dancer, they move!

This is what makes programming beautiful. The analogies that come most easily to me are to machines, like the wonderful stationary engines that sometimes ran at the Manchester Museum of Science and Industry. But in comparison their movement is so constrained... Dancing is far more representative.


I had the same issue with college classes that compared software design to skyscrapers. In my mind software has always been much more dynamic. I'll admit I like mine too dynamic (lisp, meta, reactive, context oriented, AI). But even carefully constrained software is moving. Biology is also very fitting.


I appreciate this as an avid user of daemontools (http://cr.yp.to/daemontools.html) -- thanks for the link!


Daemontools is a gift from the unix gods.

Having been a unix user for most of my professional life, I hadn't even heard of it until a project came along for which it was exactly what I needed. When I found it, within an hour I was ready to deploy with all my requirements met and, best of all, it worked exactly as any experienced unix user would expect! (And I don't remember even skimming the man page.)


The bug thing is about security holes, right? I think the comparison with Knuth is apples-to-oranges.

Or is it actually the case that, since the first public releases of djbdns and qmail, no bugs, security or otherwise, have been found?


There have been several non-security bugs. So far, I think there has been just one security bug in djbdns: http://securityandthe.net/2009/03/05/security-issue-in-djbdn...


As I understand it, there have been fewer than 10 or so non-security bugs each in djbdns and qmail, defined as programming mistakes, not intentional design decisions people disagree with. This is comparable to (or even better than) Knuth.

I don't think there is a handy list of bugs though.


The "one bug was found" only goes for exploitable security vulnerabilities. There have been many other, less impactful bugs. This is still an an impressive archievement, especially because DJB chooses to juggle with chainsaws and write in C.


It isn't even true for that. There have been several security bugs, which he stubbornly refused to fix.


There was one potential security bug (Guninski’s find), which was probably not exploitable on any installation of qmail. I don’t think there have been any other security bugs found in qmail.


Guninski had several, and there were working exploits for them. Nobody uses the OS to set process limits like djb claimed (nor should they), so it was exploitable on any installation of qmail.


I didn't know there were two. There don't seem to be several. This seems to be the best summary of the situation: http://www.jcb-sc.com/qmail/guninski.html

Those two definitely aren't "exploitable on any installation of qmail", as you say; they aren't exploitable on any 32-bit installation of qmail, any ILP64 installation of qmail, any installation of qmail with a reasonable data size rlimit, or any installation of qmail on a machine with less than a couple of gigabytes of RAM and swap, if I understand Guninski's post. If you're running qmail processes without data size rlimits, you're vulnerable to resource-exhaustion denial-of-service attacks in any case. There are established guides to configuring qmail that tell you to set rlimits. Burley's page I linked above says he was running with rlimits before the bugs were found. I'm pretty sure that when I ran qmail, I didn't set rlimits, though.

One of the two bugs is in an unprivileged process, but coupled with a local privilege escalation bug, should be sufficient for root. The other is in a process that already runs as root, although for POP, which was a thing I didn't run when I ran qmail.


I question the reasoning for calling Knuth's abilities into doubt because he had a diary of all of the bugs he encountered along the way in his programs. That is a practice of his that I often feel I should imitate.


I did that on one project in a previous lifetime. We had a paper journal that the team used, and I wrote down a description of each and every bug that I made. It improved my code substantially. I have since restarted that practice.


Perhaps people are annoyed that DJB does everything his own way, rather than using standard tools? I vaguely remember building DJB's software from source to be non-trivial, but that was a decade ago.


A good time to link to the Unix Security Holes Course djb gave in 2004: http://cr.yp.to/2004-494.html


I don't hate djb, but I won't run his code.

If you should happen to get involved in IETF discussions, I believe you will still occasionally find him there discussing how to fix various protocols. Back in 2002 I was witness/participant to one discussion involving djb that made me think twice.

Because a great deal of software cannot handle Unicode (even now), there was some discussion back then about how to handle domain names in older software once domains with non-ASCII characters were allowed. If your software assumes all-ASCII characters, how do you display/handle a domain with Greek or Cyrillic characters? This obviously affects DNS and mail software, so djb's contribution should have been valuable.

The consensus from everybody else was a temporary hack: Punycode. http://en.wikipedia.org/wiki/Punycode

djb did not like this. He suggested all software should instead represent the actual characters rather than changing them because humans should be able to read them. He proposed this should be done through ASCII art. So if you were to register ααα.com (that's three greek alphas if your browser can't render them), all software should display this as:

                 _
  /\/ /\/ /\/   /  /\ /\/\
  \/\ \/\ \/\ o \_ \/ |  |


He was very serious, and got very upset when we pointed out this would require re-writing all software deployed to date, and that if we were going to do that we'd just make it support Unicode properly.

My problem was that this idea was so batshit crazy I had to then question how crazy his actual implementations were. I dug into the source code of several of his tools.

I came away feeling that the reason bugs had not been found in his code was not that his code was bug-free, but that it is impenetrable noise.

As others have suggested here, it is uncommented, he uses a lot of pointer arithmetic, some parts don't make sense. Making a contribution to improve it is difficult. Identifying if the code is correct or not is near impossible.

DJB's code might be bug free, but it is also quite likely to be bug-ridden but nobody has been brave or bored enough to figure out where those bugs are.

I expect it does not have the same holes as BIND or sendmail (both of which are awful), but it will have holes.

The wonderful beauty of open standards and the work that the IETF has done of course, is that it means we have choices and if you want to run djb's code, you can go and do that. If you don't - and I do not - you have choices of other software that will inter-operate with it as expected.

I don't hate djb, but I don't trust the claims made about his code being watertight, and I do not believe somebody who writes code that is hard for others to understand is worthy of the title "greatest programmer who has ever lived".

Good luck to him and all who use his code, but no thanks, not for me.

P.S. - if you want to see what good code does look like in an MTA, take a glance at the source for exim one day. It is very clear and well structured, I think.


If you're referring to this old thread[1], I would say you're misrepresenting things a bit, especially the "all software" and "very serious" parts.

1: http://www.ietf.org/mail-archive/web/ietf/current/msg21058.h...


"There are several options. One option is to work around the hardware limitations in software, displaying something like

          |
   /\/ /\ |   /^ /\ /\ /\
   \/\ \/ | * \_ \/ | | |
Another, much more popular, option is to move your email reading, web browsing, etc. from your 1970s-vintage VT100 to a graphics terminal. Have you considered the VT340, for example? Or an IBM PC, model 5150?"


Just for completeness' sake: The 5150 came out in 1981, the VT340 in 1987. The thread was from 2002.

I'm German, so take this with a grain of salt: I think there might've been humor involved.


Wait, is the parent by any chance the same "Paul Robinson" in that thread? Or is this some crazy coincidence?


> if you want to see what good code does look like in an MTA, take a glance at the source for exim one day.

http://www.cvedetails.com/vulnerability-list/vendor_id-10919...

http://www.cvedetails.com/vulnerability-list/vendor_id-86/Da...


Just one more remotely exploitable bug? That's nothing.


While this [the ASCII art story] is an amusing anecdote, it seems highly unlikely that the reason qmail and djbdns don't have any bugs is because "nobody has been brave or bored enough" to look.

Could there perhaps be some other interpretation of his stance in that debate? For example, maybe it was an elaborate joke about the SNAFU that is maintaining compatibility between ASCII and Unicode?


I found djb's message in which he proposes this (in a reply to parent): http://www.ietf.org/mail-archive/web/ietf/current/msg21058.h...

I think parent is very much misrepresenting djb's position. The ASCII art was an (off-hand) suggestion for displaying UTF-8 domains on a VT-100, not a general implementation.


"P.S. - if you want to see what good code does look like in an MTA, take a glance at the source for exim one day. It is very clear and well structured, I think."

Full of bugs, 2 code execution bugs last year. www.cvedetails.com/product/19563/Exim-Exim.html?vendor_id=10919



Which, in spite of its 'perfection', you should not run unpatched on a public-facing server (or at least not qmail-smtpd), lest you become an unwitting backscatter zombie.

I love djb's software, and I love his way of looking at things, but no software stays perfect forever.


Had to look up what a backscatter zombie was; apparently it's what happens when you accept bogus mail (spam) into your mail queue without checking if it is actually deliverable -- thus end up bouncing it once the queue is processed for delivery -- and typically then sending it back to a forged address -- ie: you become a spam gateway (spammers send you email with sender-addresses they want you to try to send spam to...):

http://serverfault.com/questions/111938/how-might-i-stop-bac...


If only djb's programs would write human-readable log files...


If you are talking about the date, it's documented how to parse it:

> cat current | tai64nlocal


Since we're talking about good programming here, I have to mention this:

https://en.wikipedia.org/wiki/Cat_(Unix)#Useless_use_of_cat


Yes, it's the damn date. It doesn't matter how well documented it is, once you have to apply a conversion to logfiles before they can be easily read, you lose all your existing workflow. It becomes a pain to use.

I never understood djb's aversion to using human-readable dates in the log files. It's not a lot of effort for the computer to produce or to consume them, so we really should push the effort onto the computer, and not onto the user.


    1. push the effort onto the computer
    2. complexity increases
    3. more bugs
Given the number of bugs dealing with human-friendly dates so far, it's not hard to predict which way djb prefers. Besides, it's just a logfile, not a calendar. The need for human-friendly dates is probably overrated in this case.


Yeah, djb's way suggests using the TAI time format, because it's more accurate. Anyway, I haven't found this to be that big of a hassle. And what it's doing sounds really cool:

> The standard timestamp used in "the djb way" is TAI, for Temps Atomique Internationale. It is an absolute measure of elapsed time based on the decay of cesium atoms, where one day is equivalent to exactly 86,400 TAI seconds.

> But the actual rotation of the Earth does not exactly conform to the TAI idea of time. In fact the Earth is slowing down a little bit, so that each physical day is minutely more than 86,400 TAI seconds.

> Our human notion of time is based on this physical day length, and is represented by the UTC timeclock. The UTC timeclock is intended to represent a clock that points exactly 12:00 noon when the Sun is directly overhead on the solstice at 0 degrees longitude.

http://thedjbway.b0llix.net/leapsecs_update.html


Yes, leap seconds are an abomination and something that should only exist for odd use cases, like astronomers or something. Forcing UTC into all software and civil use is vile. Hopefully UTC will stop getting leap seconds, or we'll move to something like UTC with no further leap seconds.


Yeah there should be a decision on that this year: http://www.ucolick.org/~sla/leapsecs/


The aversion to human-readable timestamps is that he's using TAI to encode the timestamp so you don't e.g. get human-readable timestamps that have the same second twice after a leap second because the format is unambiguous in this regard.


The programs themselves log clearly. multilog -- the log sink that handles timestamping and rotation -- does that, but you're not required to use it.

In many cases, I exec logger(1) instead as my log sink, which will emit logs to syslog, or a homegrown replacement that can handle lines longer than 8192 bytes.


Personal opinion: the world needs more code generators like qhasm.



