"I think that djb redo will turn out to be the Git of build systems." (pozorvlak.livejournal.com)
147 points by jerf on Jan 14, 2011 | 49 comments



This essentially turns "make" inside out. A makefile lists the dependencies and has minimal glue to run the build commands based on them; redo instead has the build commands invoke helper tools to specify the dependencies.
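For example, a minimal .do rule might look like this (a rough sketch; in apenwarr's redo, $1 is the target, $2 is the target minus its extension, and $3 is a temporary output file that redo provides):

    # default.o.do -- builds any foo.o from the matching foo.c
    redo-ifchange "$2.c"       # declare the dependency at build time
    gcc -c -o "$3" "$2.c"      # write the result to redo's temporary output file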

Overall, I quite like it. One downside of the current design that I see as being somewhat hard to fix is that it doesn't handle build commands with multiple outputs. There are kludgey workarounds, but they require either lying to redo about the true dependencies (which always fills me with dread) or making parallel builds unreliable.


Those downsides sound massively important to fix.


I've been agitating on the mailing list. True cases of this seem to be fairly rare, and I have come up with a slightly more effective (but ridiculous) workaround: have the multiple-target command create an archive of its output, and then have each actual target depend on the archive and extract itself from it.
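In .do terms, the workaround looks roughly like this (tool and file names here are made up):

    # outputs.tar.do -- run the multi-output tool once and archive everything it made
    redo-ifchange gen.input
    tmp=$(mktemp -d)
    gen -o "$tmp" gen.input             # hypothetical tool that emits out.c and out.h
    tar cf "$3" -C "$tmp" out.c out.h
    rm -r "$tmp"

    # out.c.do (and likewise out.h.do) -- each real target depends on the
    # archive and extracts itself from it
    redo-ifchange outputs.tar
    tar xOf outputs.tar "$1" > "$3"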


There is yacc, as a fairly common standard case, and I find that quite a few of my homegrown tools for little languages fall into this category (e.g. translating a table into a header file, a source file, and a Ruby script).


Bernstein's work is really understated. On his site you can get a better glimpse of what he's worked on: http://cr.yp.to/djb.html


I've always found something about djb a bit off-putting. On the other hand, I have huge respect for the fact that when he sees something that he thinks is crap, he doesn't just bitch and moan, he makes an alternative (he bitches and moans as well, of course).


I agree with the blog, in that it seems disappointing that it doesn't integrate fabricate.py/memoize.py automatic-dependency-detection via strace. Those projects seem to advance the state of the art a lot more as a make-replacement. The git analogy is great though; while in general I think there should be a very high bar for throwing away working systems like make, in this case there's a potentially worthwhile set of different ideas: hashing for change detection, strace for dependency detection, and using scripts/whatever instead of a make-specific DSL.


You might be interested in tup (http://gittup.org/tup) in that case. It finds dependencies by preloading a library that tracks all the file accesses a program makes. IMHO this is a nicer approach than scanning strace output.


Thanks - great link. Certainly much faster than strace!


I agree, I feel like redo isn't going far enough. I'm not sure if I'm missing something, but semantically this is very similar to make, even though the syntax looks nicer.

Of course, I'm completely biased :) I have my own build system called fbuild (http://github.com/erickt/fbuild) that's conceptually similar to fabricate/memoize. There's also mem (http://srp.github.com/mem/getting-started.html) and wonderbuild (http://retropaganda.info/~bohan/work/sf/psycle/branches/boha... -- although that link appears to be broken at the moment) if you're interested in some other projects that have similar roots.

I believe the key insight that we all independently made is that we can use the procedural execution stack for the dependency tree, and cache the function arguments and results to avoid work duplication. I believe this makes it easier to reason about what's going on in the build system, compared to declarative make-style systems, because you can easily trace the inputs and outputs of any dependency.

Furthermore, you can do some pretty interesting things if you push this model. It's pretty simple to do integrated configuration and building. For instance, in fbuild I can do (fbuild uses Python 3 as its host language):

    import fbuild.builders.c
    import fbuild.config.c.linux

    def build(ctx):
        # make a c compiler.
        builder = fbuild.builders.c.guess_static(ctx)

        # check if the system supports epoll
        if fbuild.config.c.linux.sys_epoll_h(builder).header:
            use_epoll = True
        else:
            use_epoll = False

        builder.build_exe('foo', ['foo.c'],
            defines=['USE_EPOLL' if use_epoll else 'USE_SELECT'])

I don't think many of the new build systems have looked into the configuration issue yet, which is a shame. I'd love to get some more ideas out there.


That can be handled entirely as a separate process that creates the dependencies that redo then uses. Separation of concerns argues that there's no need to tie them together.

In fact, they have an example of using gcc to generate header-file dependencies for C. A command-line tool that ran a command and spat out what files it accessed (via strace, dynamic library interposition, or even crazier methods like interpreting the binary) would hook into this system fairly seamlessly.
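Something along these lines, as a crude sketch (Linux-only, and the strace output parsing is deliberately naive):

    # trace-deps.sh -- run a build command under strace and list the files it opened
    strace -f -e trace=open,openat -o strace.log "$@"
    grep -o '"[^"]*"' strace.log | tr -d '"' | sort -u

    # e.g.: sh trace-deps.sh gcc -c foo.c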


I really like the idea of strace for dependency detection. Unfortunately, some people still use Windows, which means I have to build on Windows. Does anyone know of an equivalent to strace one could use for dependency detection?


how does automatic dependency detection using strace work?



Maybe you get strace output from a running program and it uses that to evaluate what dependencies/versions you need?


how does that fit into a build system? why would you require inspection of a running program to resolve build-time dependencies? and moreover, it's not like running a binary requires it to open all of the header and source files needed at build time, so how would you glean any useful data from it?

and what about systems that don't have strace, like BSD or OSX or solaris or windows? there's dtrace on those platforms (minus windows), which requires root, and requiring root access for dependency resolution isn't exactly a great idea.


It sounds like you don't strace your project, you strace the compilers and linkers to find out which headers and libraries they actually use. Then you know exactly which builds are possibly out of date at any point, even after a platform upgrade. Nobody bothers to note stuff like libc in their makefiles, a mistake which make can't help you avoid.


I think reading strace's output was a great experiment, but I don't think it's really that necessary. Most programs have some way of spitting out their dependencies, and others follow pretty standard rules for file generation. For instance, OCaml's optional interface files (.mli) compile to .cmi files, and implementation files (.ml) compile to .cmo files. However, if no .mli file exists, compiling the .ml file will also generate a .cmi file.

It can be a pain, but it's really not that hard to handle if your build system provides a way to abstract out build patterns.
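For example, the OCaml rule above might be encoded roughly like this (a hypothetical sketch using redo's $1/$2/$3 conventions; the stray .cmi side effect is exactly the multiple-output wart discussed upthread):

    # default.cmo.do -- hypothetical sketch of the OCaml pattern above
    if [ -e "$2.mli" ]; then
        redo-ifchange "$2.ml" "$2.cmi"   # explicit interface: the .cmi is built separately
    else
        redo-ifchange "$2.ml"            # no .mli: this compile also emits the .cmi
    fi
    ocamlc -c "$2.ml"
    mv "$2.cmo" "$3"                     # hand the real output back to redo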


FreeBSD has kernel DTrace (off by default), with userspace tracing on its way (in CURRENT). ktrace is a better option there, especially if you want to support other BSD systems as well.

I've used ktrace to help create minimal jail environments, quite handy: http://www.aagh.net/files/mkjail


So basically if you don't completely exercise your program while creating your build system, your build system won't know about all your dependencies?

Suddenly make is starting to seem pretty nice...


We're talking about build dependencies, not runtime dependencies. A build dependency is - by definition - one that is touched during the build process. strace exactly identifies that set.


This reminds me a lot of CDE Pack; can you explain how this is different?

If my program needs to be dynamically linked with, say, OpenSSL, but rarely actually uses it, how does introducing strace into this situation help?


i don't see how... if my code requires foo.h and bar.c, and those get compiled and linked into one executable, how will running that program expose that?

moreover, how can you run strace to figure out dependencies on a program that hasn't been built yet? how does it determine when new dependencies are added?


You strace the build steps (gcc bar.c), not the output program.


you don't care about the dependencies of a non-existent target; you have to run the build anyway. and once you've run it at least once, you will have the full set of its dependencies for the next time you run your build.


Does the strace approach allow parallelizing the build?


I suspect you'd have to repeat a task if you discovered that it was dependent on another task's output. You would still discover this though, so your final output would still be correct.


Here is a link to the actual project:

https://github.com/apenwarr/redo#readme


Reading the title, I thought djb had implemented a build system and I was extremely excited. Still quite interested, but it doesn't have nearly as much allure.


He didn't implement it, but he designed it in quite some detail, as far as I understand.


I urge you to look at the link to djb redo posted in the beginning of the article. It's quite a teaser: http://apenwarr.ca/log/?m=201012#14


Yes, there's code. I had to choose between linking this or following further down the chain, and thought this actually did add something: while I could have gone to apenwarr's post directly or linked the design doc, it would have sunk without a trace.


Yes, and I thank you for linking the way you did! I was mostly commenting on djb's reputation for creating extremely robust software, and that it's difficult to refer to this without implying that djb implemented it. However, I'm sure that apenwarr is also an excellent programmer, and as a regular abuser of Make, I'm going to give his redo implementation a shot, especially since djb's design fits my use case much better.


I would agree that Apenwarr is an excellent programmer. I've been using his sshuttle for some time now.

https://github.com/apenwarr/sshuttle


I was enjoying the README and then I hit the part about using hashes to determine staleness, which is a good idea, but in the Python implementation those hashes are stored in a sqlite database. That seems a little excessive for a lightweight build system.

This is something that seems like it could be handled by the file system:

    $ cat .redo/artifact-source/src/path/to/file.c
    f572d396fae9206628714fb2ce00f72e94f2258f

Or the reverse (hash to source path).
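Something like this, roughly (paths as in the hypothetical layout above):

    # record each source file's hash under a mirrored .redo/ tree
    mkdir -p .redo/artifact-source/src/path/to
    sha1sum src/path/to/file.c | cut -d' ' -f1 \
        > .redo/artifact-source/src/path/to/file.c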


As for the problem of cross-platform portability with new make tools, and the complaint about having to learn a new language with make replacements (two complaints raised in the post's comments), it might be possible to have a simple shell script build the build system, which then builds your code.

I made such a thing for C++ years ago. It adds a header and a footer to your pure C++ build script, compiles it, and then runs it (compiling your program). It also included extension libraries that did some more advanced things, such as detecting local dependencies, but it was still very toy-like and unpolished. I never advertised it widely, especially after I started to use Ruby instead of C++ for the project it was related to.

Still, the basic idea of including the build system in source form with one's code might be interesting to some people... at least it makes it easy to fix, deliver, and extend.

http://sourceforge.net/project/shownotes.php?release_id=3730... (unix / linux only)


Anyone know how redo compares to SCons/CMake/Ant?


- hundreds of times simpler (literally, if you count source code lines)

- much easier to use in the sense that if you know the compile command you need to use (or the shell command or whatever), you already know how to use redo

- strictly a build system (CMake and SCons also do autoconf stuff)


The .do files are not even remotely readable. Try scons. At least you can understand what it's doing even if you know little about it with just a glance. None of this a.c.c.c.c,$1 gibberish.


IMO, a build system for daily work has two very important properties: correctness and speed. Scons seems to get correctness right, but fails to be fast enough (see e.g. http://gamesfromwithin.com/bad-news-for-scons-fans)

Now, there is a third important property, which is clarity. But clarity for a newcomer is less important than clarity for a person who uses the build system daily.

I investigated several alternatives to make for our C++ game framework and settled on Waf. It's quite complex, and a side effect of that is that we haven't integrated many of our tools into the build system, just because doing so requires a deep understanding of Waf's model. Which I haven't acquired, well, mainly because of laziness.

Thus, for lazy people like myself, clarity can affect both correctness and speed in practice.

What I like about redo is the simplicity. Based on my initial experience, it seems that aside from the multiple-output-files problem, it doesn't get in your way.


Interesting that git and redo share the problem of providing POSIX utilities on Windows. Interesting also that nobody seems to have suggested a merge between the two. If redo is that small, would it not be feasible to package it as git-redo and not solve the same problem twice?


git and redo have nothing in common, except for being utilities developers may use.

If nothing else, they should never be merged, because each is trying to provide a POSIX utility, which should always be: "Do one thing and do it well."


Interesting thought, except the article specifically says that git does NOT have this problem.


That's because there is nothing about git that requires you to implement it with sh; that was just the way it was written. `redo' has you literally writing shell scripts to describe the construction process, which means you either need sh, or you need to write entirely new scripts for running on Windows.


redo has the same problem as git because both are meant to be used as a small part of a UNIX environment. In this regard, it is no different from make.


> just learned the basics because nontrivial version-control tasks just got so complex, so fast, it was usually quicker to go outside the system or even redo your work. Even thinking about what was going on was hard.

Is that really true? If so, that's pathetic and my respect for my fellow developers as largely competent professionals is misplaced.


In the old days (CVS), in some circumstances it was easier to avoid CVS than actually use it. CVS had a lot of warts that people generally just worked around.


I guess you never tried renaming a file in CVS?


It's called humility. You might look it up in the dictionary. While you're at it, check narcissism.




