Lots of people have spent years writing programs spanning platforms, servers, services, and languages. However, code that is both efficient and elegant is few and far between.
What code has stood out to you for being elegant and efficient, and why?
The original version of Google's MapReduce was, IIRC, only 3 C++ classes and just a few hundred lines of code. It outsourced much of the distribution, task-running, and disk-access work to other Google infrastructure, and focused only on running the computation, collecting results for each key, and distributing them to the reducers.
The current (as of ~2012, so not that current anymore) version of MapReduce is much faster and more reliable, but there's a certain elegance to starting a trillion-dollar industry with a few hundred lines of code.
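For flavor, here's a minimal single-process sketch (mine, not the Google code) of the map/shuffle/reduce shape described above, with word count as the canonical example:

    # A toy map_reduce; the real system distributed each phase across machines.
    from collections import defaultdict

    def map_reduce(records, mapper, reducer):
        groups = defaultdict(list)
        for record in records:
            for key, value in mapper(record):   # map phase: emit (key, value) pairs
                groups[key].append(value)       # "shuffle": collect values by key
        return {key: reducer(key, values) for key, values in groups.items()}

    docs = ["the quick brown fox", "the lazy dog"]
    counts = map_reduce(docs,
                        mapper=lambda doc: [(w, 1) for w in doc.split()],
                        reducer=lambda key, values: sum(values))
    print(counts)   # {'the': 2, 'quick': 1, 'brown': 1, ...}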
There was another doozy, also by Jeff Dean, in the current (again, as of 2012) MapReduce code. It was an external sorting algorithm, and like most external sorts, it worked by writing a bunch of whole-machine-RAM-sized temporary files and then performing an N-way merge. But how did it sort the machine's RAM? Using the STL qsort() function, of course! But how do you sort ~64GB of data efficiently using a standard-library function? He'd written a custom comparator that compared whole records at a time, using, IIRC, compiler intrinsics that compiled down to SIMD instructions, with some Duff's-Device-like unrolling to account for varying key lengths. It was a very clever mix of stock standard-library functions and highly optimized, specialized code.
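The external-sort shape is easy to sketch in miniature (mine, single-machine, nothing like the optimized original): sort RAM-sized chunks into temporary "run" files, then N-way merge them.

    # Sort chunks that fit in memory, write each as a sorted run, then merge.
    import heapq
    import tempfile

    def write_run(sorted_lines):
        f = tempfile.NamedTemporaryFile("w", delete=False, suffix=".run")
        f.writelines(sorted_lines)          # lines are assumed to end with '\n'
        f.close()
        return f.name

    def external_sort(lines, chunk_size=1_000_000):
        runs, chunk = [], []
        for line in lines:
            chunk.append(line)
            if len(chunk) >= chunk_size:    # a stand-in for "machine RAM is full"
                runs.append(write_run(sorted(chunk)))
                chunk = []
        if chunk:
            runs.append(write_run(sorted(chunk)))
        files = [open(r) for r in runs]
        try:
            yield from heapq.merge(*files)  # the N-way merge over sorted runs
        finally:
            for f in files:
                f.close()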
My memory's actually hazy over whether it was qsort or sort; my intuition is that it would've been qsort because QuickSort is what you'd use when you need an in-place sort with little additional RAM required, but it's been so long that I honestly don't remember.
Ah, good. I'd initially written std::sort in the comment and then went back and edited it because I was like "Isn't std::sort usually mergesort? That wouldn't work here because it takes extra space." It's been a while since I've written C++.
I'd read somewhere [1] that the built-in Python sort function has a lot of good/clever optimizations too, though maybe not of the same kind you describe, i.e. perhaps not at the machine-language level. Tim Peters did a lot of that, per what I read, though others may have too.
[1] Think I read it in the Python Cookbook, 2nd Edition, which is a very good book, BTW.
Different kinds of optimizations. Timsort looks for already-sorted runs and uses insertion sort to extend the short ones; it exploits the fact that much real-world data is already partially sorted to reduce the number of comparisons made. The MapReduce optimization instead exploits the fact that MapReduce keys are always strings in contiguous areas of memory, and are often fairly large, to compare them really quickly.
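You can watch Timsort exploit pre-existing order with a toy comparison counter (my own example, nothing to do with either codebase):

    import random

    class Tracked:
        count = 0
        def __init__(self, v):
            self.v = v
        def __lt__(self, other):            # sorted() only needs __lt__
            Tracked.count += 1
            return self.v < other.v

    def comparisons(data):
        Tracked.count = 0
        sorted(Tracked(v) for v in data)
        return Tracked.count

    n = 100_000
    print(comparisons(range(n)))                    # ~n-1: one long run detected
    print(comparisons(random.sample(range(n), n)))  # ~n*log2(n): no runs to exploit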
Since it's old and no longer relevant to Google, it would be really interesting to have that code in some kind of code museum, as I feel it gives an interesting insight into how Google did big data (the real thing, not the marketing term) back then. Not sure if that's feasible, but I guess it doesn't hurt to ask.
I wish, but it's unfortunately not my call to make. A few other companies have done this, e.g. Microsoft open-sourcing Altair BASIC about 30 years after it came out, or id open-sourcing DOOM. Maybe if I ever go back to work for them, I can propose it. For now, consider getting a job at Google if you want to peek into the VCS history.
On a larger scale than most suggestions so far, I’ve always been impressed with SQLite. The developers have managed to create a useful-in-the-real-world tool, small and efficient enough for even quite demanding applications like embedded systems work.
The original source code was solid and well documented in the parts I've seen, but the remarkable thing to me is the way they distribute it: it comes as a single ANSI C file and a single matching header, which they refer to as the "amalgamation". It therefore works just about anywhere, has no packaging or dependency-hell issues, can be incorporated into any build process in moments, and can be statically linked and fully optimised.
Joe Armstrong, the creator of Erlang, would tell his cluster of thousands of machines to change behaviour with it. The code is really just 5 lines.
My favorite Erlang program
21 Nov 2013
The other day I got a mail from Dean Galvin from Rowan University. Dean was doing an Erlang project, so he asked: "What example program would best exemplify Erlang?"
...
The Universal Server
Normally servers do something. An HTTP server responds to HTTP requests, an FTP server responds to FTP requests, and so on. But what about a Universal Server?
Surely we can generalize the idea of a server and make a universal server which we can later tell to become a specific server.
Here’s my universal server:
    universal_server() ->
        receive
            {become, F} ->
                F()
        end.
...then I set up a gossip algorithm to flood the network with become messages. Then I had an empty network that, in a few seconds, would become anything I wanted it to be.
A process in Erlang is nothing but a tail-recursive function; the moment it stops being one, it dies. So here it morphs into F, which can be passed in.
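A rough analogue in Python, for anyone who doesn't read Erlang (threads and Queues standing in for Erlang processes and mailboxes; the names are mine, and this loses all of Erlang's distribution):

    import queue
    import threading

    def universal_server(mailbox):
        tag, f = mailbox.get()              # block until told what to become
        if tag == "become":
            f(mailbox)                      # morph into the new server loop

    def adder_server(mailbox):              # a specific server to become
        while True:
            reply, a, b = mailbox.get()
            reply.put(a + b)

    mailbox = queue.Queue()
    threading.Thread(target=universal_server, args=(mailbox,), daemon=True).start()

    mailbox.put(("become", adder_server))   # the "become" message
    reply = queue.Queue()
    mailbox.put((reply, 2, 3))
    print(reply.get())                      # 5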
I love the story of the fast inverse square root. A bizarre piece of code from Quake 3 showed up on Usenet with a magic constant that calculates the inverse square root faster than table lookups and approximately four times faster than regular floating-point division. Inverse square roots are used to compute angles of incidence and reflection for lighting and shading in computer graphics. The author was unknown, but it was once thought to be Carmack.
I would say that the actual source code there is extremely ugly. It may be an elegant solution, but there's no way I would want to crawl around in that repo:
    float Q_rsqrt( float number )
    {
        long i;
        float x2 = number * 0.5F;
        float y  = number;
        i = * ( long * ) &y;                  // evil floating point bit level hacking
        i = 0x5f3759df - ( i >> 1 );          // what the fuck?
        y = * ( float * ) &i;
        y = y * ( 1.5F - ( x2 * y * y ) );    // one iteration of Newton's method
        return y;
    }
That has to be LuaJIT. The amount of knowledge hidden in that code base is mind-blowing. It contains a state-of-the-art tracing JIT compiler, optimized for at least 5 different platforms, and a custom assembler (DynASM) that is used to implement hand-optimized interpreter code for 6 different architectures. Additionally, it has one of the best FFI implementations I've ever seen: just parse a C header file at runtime and you can call into C code with zero overhead. All of that written by a single person (Mike Pall). If you haven't had a look, you should: https://github.com/LuaJIT/LuaJIT/tree/v2.1
I was going to say LuaJIT also. JIT compilers have a habit of looking rather like chicken scratch, but LuaJIT is pretty readable (for a JIT).
Specifically, the way it manages traces (lj_trace.c, lj_record.c) is really elegant. It's not much code, but it's the best available tracing JIT compiler and manages traces quite elegantly IMO.
I really love this blog post by Norvig. Last Christmas I was looking for a trivial project to try Clojure and had a lot of fun working through this. Highly recommended - https://github.com/prakhar1989/clj-spellchecker
The Solaris kernel source. I gained a deep appreciation for its elegance when I was working for Sun a long time ago. I can enjoy analysing that code base, which is the opposite of the feeling I get when looking at the Linux code.
It's not overly clever, but it's incredibly clear and easy to understand.
While I don't know exactly which parts were inherited, I have looked at parts of the code that definitely were not inherited from Bell Labs (features that are comparatively new) and they are also very nice to read.
Perhaps they just followed in the footsteps of the masters, but whatever the reason, the code is, in my opinion, incredibly nice to work with.
Doom 3 was touted[1] as having some "exceptional beauty." Naming, spacing of properties, consistency, and how multiple parameterized calls were formatted are quite nice. You can see an example on github[2], but for some reason the spacing is a bit off in the web view. I'd recommend cloning a local copy and taking a look.
Fast inverse square root[0] is something that I encountered in the mid-2000s in the Q3A source code. It took me a really long time to understand it, and I eventually had to show it to some professors before I really understood what was going on and why this worked.
That's really an example of how arbitrary human thought processes are. When you release the constraint that your code has to have some human-comprehensible analog, you might arrive at interesting results.
Most of that is fairly straightforward. It is using Newton's method to calculate the inverse square root. But to get there in one or two iterations, you need a good starting estimate. Taking a square root halves the floating point exponent; an inverse square root halves it and negates it. Knowing how the floating point number is packed, we know a right shift is equal to divide by two and negate it. What remains is how the shift affects the mantissa and whether some correction factor is needed. That could have been found by least-squares optimization to minimize the error.
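The Newton step itself is easy to check; a quick sketch (mine, in Python rather than the original C):

    import math

    # Newton's method on f(y) = 1/y**2 - x has the root y = 1/sqrt(x);
    # the update y - f(y)/f'(y) simplifies to y * (1.5 - 0.5 * x * y * y).
    def refine(x, y):
        return y * (1.5 - 0.5 * x * y * y)

    x = 2.0
    y = 0.7                       # a rough starting estimate, like the bit hack provides
    for _ in range(2):
        y = refine(x, y)
    print(y, 1 / math.sqrt(x))    # converges quickly given a good start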
> Knowing how the floating point number is packed, we know a right shift is equal to divide by two and negate it.
Shifting an IEEE754 floating point number does not have that effect.[0] The fact that it doesn't do that is the source of the "mystery" of fast inverse square root.
It has been a while since I looked at it, but I was impressed by Russ Cox's implementation of channels in the LibThread library of Plan 9. All channel operations, including blocking on several channels and selecting one that is ready, are described by Alt structures, which under the hood implement a simple but elegant algorithm that appears in Rob Pike's "The Implementation of Newsqueak." If you want to understand roughly how Go channels work, read that paper and look at LibThread's channel implementation.
A second choice would be not so much a piece of code as an algorithm: Thompson's construction for regular expressions.
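For anyone curious, here's a compact sketch of it (mine; regexes are taken in postfix form with '.' for concatenation, '|' for alternation, and '*' for star, in the spirit of Russ Cox's write-ups), plus the standard NFA simulation:

    class State:
        def __init__(self, char=None):
            self.char = char   # transition label; None marks a split or the match state
            self.out = []      # successor states

    def post2nfa(postfix):
        stack = []             # entries: (start_state, dangling_states)
        for c in postfix:
            if c == '.':                        # concatenation
                start2, out2 = stack.pop()
                start1, out1 = stack.pop()
                for s in out1:
                    s.out.append(start2)
                stack.append((start1, out2))
            elif c == '|':                      # alternation
                start2, out2 = stack.pop()
                start1, out1 = stack.pop()
                split = State()
                split.out = [start1, start2]
                stack.append((split, out1 + out2))
            elif c == '*':                      # zero or more
                start, out = stack.pop()
                split = State()
                split.out = [start]
                for s in out:
                    s.out.append(split)
                stack.append((split, [split]))
            else:                               # literal character
                s = State(c)
                stack.append((s, [s]))
        start, out = stack.pop()
        match = State()                         # the single accepting state
        for s in out:
            s.out.append(match)
        return start, match

    def matches(start, match, text):
        def closure(states):                    # follow epsilon (split) edges
            stack, seen = list(states), set(states)
            while stack:
                s = stack.pop()
                if s.char is None and s is not match:
                    for nxt in s.out:
                        if nxt not in seen:
                            seen.add(nxt)
                            stack.append(nxt)
            return seen
        current = closure({start})
        for c in text:
            current = closure({t for s in current if s.char == c for t in s.out})
        return match in current

    start, match = post2nfa("ab|*c.")           # postfix for (a|b)*c
    print(matches(start, match, "ababc"))       # True
    print(matches(start, match, "abab"))        # False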
For me, the memorable pieces of code are those that made me "want to become a programmer", to choose the path I chose, way back in the day.
Unsurprisingly, these were games:
1) A C64 "SnakeByte-like" game, whose exact name I forgot. It was written entirely in BASIC 2.0 (so you could list and read the source code), and with me having no C64 manual, and no English, it was a true revelation. So much fun and beauty emerging from such a concise, approachable program!
2) An ancient five-in-a-row implementation, I think in BASIC again or maybe Pascal. I remember the shock after seeing how simple the code was, compared to its (surprisingly good) playing strength and speed. My github user name, "piskvorky", is an echo of this old experience :-)
The underlying appeal seems to be a combination of simple, elegant rules giving rise to complex and fun behaviour. That, to me, is elegance.
> A Professor of Computer Science gave a paper on how he uses Linux to teach his undergraduates about operating systems. Someone in the audience asked 'why use Linux rather than Plan 9?' and the professor answered: Plan 9 looks like it was written by experts; Linux looks like something my students could aspire to write.
But see also the HN thread: "Code which every programmer must read before dying".
I worked for a networking software developer at some point in the past, and they were considering licensing a VPN stack from the SSH company [1]. That thing was in C, and the sample code included was stunningly beautiful, as was the documentation: well modularized, consistent, with good naming conventions, but above all concise. I think I still have a CD with the SDK demo; I can pull up some code from there if anyone's interested.
I was reading some of the Python standard library code and one piece stood out for me in the heapq.py module.
Merge sorted iterables using a min-heap (this is the older, Python 2 era version):
    def merge(*iterables):
        '''Merge multiple sorted inputs into a single sorted output.

        >>> list(merge([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
        [0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]
        '''
        _heappop, _heapreplace, _StopIteration = heappop, heapreplace, StopIteration
        _len = len

        h = []
        h_append = h.append
        for itnum, it in enumerate(map(iter, iterables)):
            try:
                next = it.next
                h_append([next(), itnum, next])
            except _StopIteration:
                pass
        heapify(h)

        while _len(h) > 1:
            try:
                while 1:
                    v, itnum, next = s = h[0]
                    yield v
                    s[0] = next()           # raises StopIteration when exhausted
                    _heapreplace(h, s)      # restore heap condition
            except _StopIteration:
                _heappop(h)                 # remove empty iterator
        if h:
            # fast case when only a single iterator remains
            v, itnum, next = h[0]
            yield v
            for v in next.__self__:
                yield v
If I'm not mistaken, this was written by Raymond Hettinger, and it's always a pleasure to read/use his code.
I really like the code for Redux. It's only a few hundred lines of code, but it has spawned an ecosystem of plugins and already has over 13,000 GitHub stars.
Definitely the implementation of a (simplified) Scheme interpreter in Scheme that we wrote years ago in a functional programming course at university. It made me fall in love with functional programming.
It's all in how one defines "elegant", I suppose. It relies on a handful of almost-side-effects and the lucky confluence of a series of unrelated language choices. I see this and it reminds me of a dance: someone seeing the swinging pieces of the code and making a minor touch in just the right place, taking advantage of edge-effects that were never designed for exactly this but combine to make something emergent, simple, and easy. I see this as breathtakingly elegant.
> What's the most elegant piece of code you've seen?
Well, not the most elegant, but a very elegant piece of code that solves the Towers of Hanoi puzzle in a few ingenious lines (Python, as shown):
    def printMove(disk, src, dest):
        print("Move disk %d from %s to %s" % (disk, src, dest))

    def moveDisk(disk, src, dest, using):
        if disk >= 1:
            moveDisk(disk - 1, src, using, dest)
            printMove(disk, src, dest)
            moveDisk(disk - 1, using, dest, src)

    count = 3
    moveDisk(count, 'A', 'B', 'C')
Result:

    Move disk 1 from A to B
    Move disk 2 from A to C
    Move disk 1 from B to C
    Move disk 3 from A to B
    Move disk 1 from C to A
    Move disk 2 from C to B
    Move disk 1 from A to B
Turbo's code seems very beautiful to me. Before finding it, I saw web servers as either a big black box coded in a hard-to-grok systems language (Apache and Nginx), or as a testing tool written in a slow but readable scripting language (Python or PHP). Now I can dive into a fast, robust, and easy-to-read web server (and client) and see nearly every bit of functionality. Lua continues to impress me and strikes me as a great language for web development.
I have limited exposure, being a junior web dev, but I find Laravel to be very well structured and what some might describe as elegant. Jeffrey Way shared opinions on it that I agree with at last year's Laracon: https://youtu.be/mDotS5BDqRM?t=1539
In the sense that perfection is achieved when there is nothing left to take away, GO.COM as described in http://peetm.com/blog/?p=55. It doubles as a winning entry in 1994's IOCCC.
A quiz question from the old Imphobia demo diskmag: a single instruction to extend a byte value in al across a full 32-bit word. Impossible, right? Nope:
Some people might hate this answer, but sometimes the most elegant piece of code doesn't actually exist.
Code is a tool used to solve problems. Sometimes the most elegant solution to a problem doesn't involve writing code; it involves examining a business process and changing the workflow a bit to avoid needing to code something, or rewording an item in the user interface so that certain code isn't needed anymore.
Sometimes, rather than being a drone and just churning out whatever code you're told to, the bravest and most elegant solution to a problem is figuring out a way to avoid it altogether.
This is the hardworking lazy programmer's most elegant code.