People really mean different things when they say "simple".
For some it may be "the abstraction is so clean, small and self-contained that I don't have to look at the implementation"; for others, "the implementation is so clean, small and self-contained that I don't need any abstraction". The author of this post clearly subscribes to the latter philosophy.
I guess many flamewars on mailing lists ultimately boil down to this culture clash.
The best designs have both clean abstractions and small/simple implementations.
If you have a clean abstraction but a complicated implementation, then there are probably lots of moving parts and sub-steps that can go wrong. No user is completely isolated from implementation: take shell completions for example. Shell completions are this nice simple interface -- push "TAB" and the shell will help you out if it can. But because there is so much machinery going on in the background, a slow or unavailable NFS volume (for example) can make your completion suddenly grind to a halt, even if you don't care about that NFS volume right now.
What Richard is talking about in this example is having as little extraneous stuff going on as possible. The fewer moving parts, configuration files, commands you have to run, etc. the less there is to understand, the fewer states the entire system can be in, and the fewer sub-steps there are that can go wrong.
The hallmark of a truly great abstraction is that it gives you the functionality you need without getting in your way, and that higher-level abstractions can be layered on top without the lower-level abstraction getting in the way.
It is not widely known that you can have both clean abstractions and simple implementations.
At the Viewpoints Research Institute[1] (co-founded by Alan Kay), they are trying to have their cake, give it to everyone, and eat it too[2]. They already have some successes: for instance, a set of programming languages that can implement themselves, from bare-bones x86 to sky-high abstraction, in less than 1,500 lines; a TCP stack that takes 200 more lines (a 50-fold reduction compared to a typical C implementation); and more.
Efficiency wasn't even the priority. But as it turned out, many optimizations were either very generic (across several languages), or plain unnecessary (some things are faster than anticipated, to the point of being good enough).
There is, ultimately, a direct conflict between abstraction and efficiency. Abstraction gets its power by using indirection: generalizations that stand in for specific cases. Making abstractions concrete involves a flattening of all these indirections. But there's a limit on our ability to automate the manipulation of symbols - we don't have Strong AI that is able to make the leaps of insight across abstraction levels necessary to do the dirty hackish work of low-level optimization. The fabled "sufficiently smart compiler" doesn't exist, and is unlikely to until we have Strong AI.
I'll further submit that a design that has clean abstractions and simple, small implementations either doesn't do much, to the point that it's not very useful on its own, or, if it is useful, the complexity has moved somewhere else, perhaps into metadata or configuration or tooling. It's like there's a law of conservation of complexity; there is a certain amount of irreducible complexity in useful programs that cannot be removed, and remains after all the accidental complexity has been eliminated.
Seeing http://vpri.org/html/work/ifnct.htm, I think we have good reason to believe that we are currently several orders of magnitude above that amount of irreducible complexity.
So, while I mostly agree with what you just said, I don't think we've hit the bottom yet. Silver-bullet-like progress still looks possible.
No, the dichotomy is better described as r-selection (worse-is-better) versus K-selection (the right thing)[0].
For a practical example of worse-is-better software becoming popular, look at the World Wide Web in relationship to Project Xanadu[1]. For a practical example of The Right Thing software being popular, look at Python. The argument is broadly that New Jersey worse-is-better produces software more likely to become successful than the MIT right-thing approach. That doesn't mean that worse-is-better is always more successful, or that the MIT approach can't produce successful software, or that there's not a time and/or place for both approaches.
No, I don't think that captures the ideas in Richard Gabriel's essays. Simple vs. The Right Thing are the two sides, where the right thing is a complete and polished piece of software full of features, what a typical young engineer would create if given lots of time and resources to do things "the right way".
I think he calls it "worse" as a way of emphasizing that the better course of action may not feel fully responsible, as in the PC-losering example.
It's mentioned in the paper "EROS: a fast capability system" (http://www.eros-os.org/papers/sosp99-eros-preprint.ps): they write "This design is similar to that adopted in Fluke and MIT’s ITS system" and cite the ITS reference manual for the latter.
(I remembered reading that paragraph since, before that, I had only ever seen the technique mentioned in Gabriel's essay :) ).
Well, my point is this: If only three or four OSes have ever implemented it, none of which have been very successful, how important could it possibly be? Do other OSes have ways of making the problem irrelevant without actually fixing it like ITS, Fluke, and EROS did?
One of my favourite quotes from C.A.R. Hoare seems relevant:
"There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. It demands the same skill, devotion, insight, and even inspiration as the discovery of the simple physical laws which underlie the complex phenomena of nature."
It's kind of interesting that his approach, using essentially 1980s-era scaling technology (inetd forking a server per request, plus old-style CGI for dynamic content), has no trouble at all running these high-traffic websites on a tiny VPS.
edit: Though come to think of it, it shouldn't be that surprising. If that architecture managed to work at all on 80s hardware, it should scream on anything from 2011, even a VPS.
Persistent server processes are an artifact of the shift towards Python, Ruby, and other languages which optimize programming time over execution efficiency. When just starting up the binary takes over a second (to load frameworks and libraries and plugins and so on), the process-per-connection model is completely unworkable. That's why FastCGI, SCGI, WSGI, and other similar protocols all became popular at the same time as Rails and Django.
If you're willing to use pre-compiled binaries, with few or no large dependencies, then CGI becomes practical once more. From the link, it sounds like the Fossil website is run entirely on such binaries (not surprising, considering the author).
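To make that concrete, here's a minimal sketch of what such a CGI looks like in C (a made-up example, not Fossil's or althttpd's actual code): the web server execs the binary once per request, passes request data via environment variables and stdin, and reads the response from stdout. There's essentially nothing to load before main().

  /* hello.c -- hypothetical minimal CGI program */
  #include <stdio.h>
  #include <stdlib.h>

  int main(void) {
      /* the web server sets CGI variables like QUERY_STRING */
      const char *query = getenv("QUERY_STRING");
      /* headers first, then a blank line, then the body */
      printf("Content-Type: text/plain\r\n\r\n");
      printf("hello from a compiled CGI\n");
      printf("query string: %s\n", query ? query : "(none)");
      return 0;
  }

  /* build statically to avoid dynamic-loader overhead as well:
       gcc -O2 -static -o hello.cgi hello.c */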
I can confirm it does run on a single binary. I used Fossil for weeks, and its internals are just plain beautiful -- so much so that it almost makes me want to write websites in C++. (I know Fossil is in C.)
Executing a C program with some non-trivial shared-object dependencies may also incur some significant overhead (dynamic linkage). But it is of course possible to statically link.
There's a really weird aversion to fork among people who don't understand just how fast and efficient fork is on Linux (and BSD) systems. It's an extremely common misconception, one that occasionally leads to people over-complicating their solutions to problems.
I see it with databases, too... people often want to introduce MySQL or another heavyweight database into situations where a flat text file would be vastly faster and vastly more efficient.
It's a problem of thinking that solving problems in the large is the same as solving small problems.
Fork goes out the window if you're connecting to a database, need to keep any structures in memory, or can envision a need to work on Windows. This covers a heck of a lot of situations.
And fork is still kind of expensive. Say a CGI just prints out a few K; the forking time will be relatively substantial. If you can afford it, who cares, but it is an issue.
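If you want to put a rough number on that, a quick-and-dirty microbenchmark is easy to write (just a sketch; the real cost depends a lot on how big the parent process is, since fork has to copy page tables even with copy-on-write):

  /* forkbench.c -- hypothetical sketch: ballpark the cost of fork+wait */
  #include <stdio.h>
  #include <sys/time.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int main(void) {
      const int N = 1000;
      struct timeval t0, t1;
      gettimeofday(&t0, NULL);
      for (int i = 0; i < N; i++) {
          pid_t pid = fork();
          if (pid < 0) { perror("fork"); return 1; }
          if (pid == 0) _exit(0);   /* child exits immediately */
          waitpid(pid, NULL, 0);    /* parent reaps it */
      }
      gettimeofday(&t1, NULL);
      double us = (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_usec - t0.tv_usec);
      printf("roughly %.0f microseconds per fork+wait\n", us / N);
      return 0;
  }

Compare whatever number that prints against the time it takes to actually generate your few K of output; that tells you whether the fork is the part worth worrying about.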
High-traffic websites? He said a quarter-million requests a day, so about 3 requests per second (and inetd forks a process per connection, not per HTTP request).
To his credit, he sort of clarified that by implying it's "high traffic - for the sort of website that sits on 1 of 20 VMs on a single physical server". If you do the naive multiply-everything-out-assuming-it-all-scales-linearly calculation, 3% CPU on 1/20th of the physical server suggests an upper limit somewhere in the region of 165 million requests per day - that'd certainly count as a high-traffic site...
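Spelling out that back-of-the-envelope math (assuming everything scales linearly, which it won't exactly):

  250,000 req/day / 86,400 s/day          ≈ 2.9 req/s
  2.9 req/s x (100% / 3% CPU) x 20 VMs    ≈ 1,900 req/s
  1,900 req/s x 86,400 s/day              ≈ 165 million req/day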
Well, high-traffic by certain standards. Higher-traffic than what seems to cause many server setups to fall over, anyway; front-page on HN or Slashdot doesn't send more than that much traffic, and yet a bunch of sites seem to be unable to handle it.
In case somebody doesn't know: D. Richard Hipp not only wrote SQLite, the SQL engine used in the iPhone, the Chrome browser, and who knows where else; he also put it in the public domain.
Moreover, SQLite increases the size of the executable by only around 260 KB. That was one of the design goals!
However, he is only able to do this because he writes in C. Python and Ruby are not slow, but they have terrible startup time, which causes a huge amount of overhead for CGI. They probably do 1,000,000x the work of a C program before getting to main(). I bet he statically links his binaries (or at least has few dependencies on shared objects) because that has a pretty big cost too.
I wonder if writing the CGIs in Lua would have the same efficiency. In Lua you would pretty much have to fork because it doesn't have threads and the interpreter is not thread-safe (there's no global interpreter lock). Or maybe v8 would work too.
Uh, no. Lua is thread-safe in the sense that you can use multiple Lua states in the same program. That is exactly what Lua Lanes and the other package are doing, and it's what is described in the Programming in Lua book by Roberto Ierusalimschy.
But you can't access the same lua_State from 2 C threads, and you can't share data structures between 2 separate lua_States -- you would have to serialize all your data structures and message-pass between them. So if Lua had an interpreter lock, it would enable a Lua-threading library like the ones Python and Ruby have. The lock would just belong in the lua_State rather than being an actual C global. (Not saying it should add this; just pointing out the difference.)
Your use of the word "slow" doesn't have any meaning. What I meant is that Python and Ruby are not slow in the sense that you couldn't write "scalable" CGIs with them in Hipp's style IF they didn't have horrendous startup time. That is, once the interpreter is started, you can do a LOT of work in 50ms of Python or Ruby (which is exactly what sites that serve billions of page views a month are doing). It's just that loading the interpreter can take upwards of 100ms with a lot of libraries. So with those languages, people use persistent servers rather than what Hipp is advocating.
In Lua you could have a persistent C program and initialize a new lua_State for every request. That would be the moral equivalent of CGI, without the fork(), since all the state is wiped between every request. But, getting back to the original point, that wouldn't retain the ease of administration that Hipp wants because it wouldn't work with inetd.
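Roughly like this, using the stock Lua C API (the handler script name is made up; this is just a sketch of the idea, not anything Hipp does):

  /* luahost.c -- hypothetical: one fresh lua_State per request */
  #include <stdio.h>
  #include <lua.h>
  #include <lualib.h>
  #include <lauxlib.h>

  static void handle_request(void) {
      lua_State *L = luaL_newstate();      /* fresh, empty interpreter state */
      luaL_openlibs(L);                    /* load the standard libraries */
      if (luaL_dofile(L, "handler.lua"))   /* run the per-request handler */
          fprintf(stderr, "lua error: %s\n", lua_tostring(L, -1));
      lua_close(L);                        /* throw everything away */
  }

  int main(void) {
      /* in a real server this would sit inside the accept() loop,
         once per incoming connection */
      handle_request();
      return 0;
  }

You get CGI's wipe-the-state-every-request property without fork() and without re-reading the interpreter binary off disk, but as noted, you lose the inetd-level simplicity.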
FWIW there are numerous libraries available for OS threads in Lua. Some are low-level, directly binding the OS threading API[1], some are high-level[2], supporting message passing and atomic counters, etc.
Right, but this is different than Python and Ruby's threading, where threads can share state because they have a global interpreter lock. Lua has no such thing so both of those packages have to use message passing, which I think is good but not convenient in a lot of cases. Even though they are using threads, there is one Lua state per thread, so your threads have to serialize everything before communicating.
As mentioned in the PIL book, they are better termed "Lua Processes" even though they are implemented using OS threads.
Does Lua not have the startup time issues of Python/Ruby? If it does, then to counteract the lack of threads you'd probably use a coroutine-based system like gevent and Python's greenlets.
matthew@rusticanum:~$ time python2.6 -c 'print("hello world")'
hello world
real 0m0.027s
user 0m0.024s
sys 0m0.004s
matthew@rusticanum:~$ time lua -e 'print("hello world")'
hello world
real 0m0.005s
user 0m0.000s
sys 0m0.000s
That's a good comparison to make, but once you have a Python program or Ruby program importing 20 packages, each with several modules, then it gets really slow. Like 100-500ms, which is more work than the "real" work of a handling many HTTP requests. Bad packages can do arbitrary computation at import time.
To be fair, I haven't run any Lua programs with 20 packages... AFAIK Lua didn't have a module system until Lua 5, and it's not as capable as Python or Ruby's.
I love small and simple, I love self-contained. But then I got to GetMimeType, where he wrote his own binary search code instead of calling the C library to do it.
Though it sounds simple, binary search is a minefield of edge cases and I've seen textbooks that do it wrong.
So when I read the GetMimeType function, I didn't think "how nice and simple"...I thought "Hmm, I bet that doesn't work."
Oh, I'm sure they are, purely based on who wrote them...but in any ordinary codebase I'd toss this code out in a second and replace it with a library call. It's just not worth the effort to decide whether it's correct -- like not bothering to bend over to pick up a lost penny.
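For comparison, a library-call version is only a comparator and a table away - something like this (a sketch with a toy table, not the actual althttpd code):

  /* mime.c -- hypothetical mime lookup via the C library's bsearch() */
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  struct MimeMap { const char *ext; const char *type; };

  /* must stay sorted by extension for bsearch() */
  static const struct MimeMap mimeTable[] = {
      { "css",  "text/css"   },
      { "gif",  "image/gif"  },
      { "html", "text/html"  },
      { "jpg",  "image/jpeg" },
      { "png",  "image/png"  },
      { "txt",  "text/plain" },
  };

  static int cmpExt(const void *key, const void *elem) {
      return strcmp((const char *)key, ((const struct MimeMap *)elem)->ext);
  }

  static const char *getMimeType(const char *ext) {
      const struct MimeMap *m = bsearch(ext, mimeTable,
          sizeof(mimeTable) / sizeof(mimeTable[0]), sizeof(mimeTable[0]), cmpExt);
      return m ? m->type : "application/octet-stream";
  }

  int main(void) {
      printf("%s\n", getMimeType("png"));   /* prints image/png */
      return 0;
  }

All the edge cases live inside bsearch(), which someone else has already debugged.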
For those who do use inetd, you might be interested in xinetd which has some access controls. I've been using this for years as well for very simple apps and it "just works."
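For anyone who hasn't used it, a service entry looks something like this (service name, port, and server path are all made up here):

  # /etc/xinetd.d/myhttpd -- hypothetical example
  service myhttpd
  {
      type        = UNLISTED
      port        = 8080
      socket_type = stream
      protocol    = tcp
      wait        = no
      user        = www-data
      server      = /usr/local/bin/myhttpd
      # the access-control / rate-limiting knobs plain inetd lacks
      only_from   = 0.0.0.0/0
      instances   = 50
      per_source  = 10
  }

xinetd still forks the server per connection just like inetd, so the simple-administration argument holds.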
The on-request server design is an interesting approach. It seems like it would provide a good framework for a small personal site. I wonder where this fits between using a site generator and pushing to github (or a similar push-based cloud provider) vs running a full web stack. I guess it comes down to preference and requirements. Good way to provide server stats via HTTP, though.
I think in general that's what engineers try to do: make it simple so it's fun to build and easy to maintain. But I have found that complexity is almost always added by the business side, whether out of genuine business necessity or because some idiot sales guy adds what he thinks is cool without really thinking it forward.
To make something simple for the user, it needs to be well thought out... Surprisingly, even though humans have had brains for a really long time, it's hard to think :)