
From the wiki for [Filesets](https://github.com/boot-clj/boot/wiki/Filesets)

"Build workflows necessarily involve the filesystem because it's not practical to send file contents around in memory as function arguments, and because we want to be able to leverage existing JVM tooling that generally operates on things in the class path and not on in-memory data structures."

So, like Unix pipes? What exactly is impractical about passing big values as function arguments? Isn't that how gulp works?




Well, there are two things there: the unbounded potential size of files and the ability to coexist in the current JVM ecosystem without rewriting every bit of code that looks for things on the class path.

I think the files you're dealing with will generally fit in memory, but you probably still wouldn't want everything in memory all the time; you'd need to stream them. Consider an uberjar that's 10M as a JAR but might contain contents that are 10x that size uncompressed.
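To make the streaming point concrete, here is a minimal sketch in Python (not boot code, and not how boot itself handles JARs). It builds a small compressed archive and then walks its entries one chunk at a time, so only a single chunk is ever held in memory. The helper names `build_archive` and `stream_entry_sizes` are made up for this illustration, and the archive lives in a `BytesIO` buffer only to keep the example self-contained.

```python
import io
import zipfile


def build_archive(entries):
    """Return the bytes of a deflate-compressed zip built from {name: bytes}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


def stream_entry_sizes(archive_bytes, chunk_size=64 * 1024):
    """Measure each entry's uncompressed size while reading one chunk at a time."""
    sizes = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            total = 0
            # zf.open returns a file-like object that decompresses lazily.
            with zf.open(info) as entry:
                while True:
                    chunk = entry.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
            sizes[info.filename] = total
    return sizes


# Highly repetitive data compresses well, so the archive is far smaller
# than the contents it expands to.
archive = build_archive({"a.txt": b"x" * 1_000_000, "b.txt": b"y" * 500_000})
sizes = stream_entry_sizes(archive)
```

The same loop works against an archive on disk by passing a path to `zipfile.ZipFile` instead of a buffer, which is closer to the uberjar case described above.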

This brings us to the second point. No existing JVM tools are prepared to accept files via pipes. Tooling on the JVM looks for things on the class path and in JAR files. We definitely don't want to have to rewrite every tool to use some in-memory pipe system.

The fileset and pod abstractions provided by boot work together to give you the best of both worlds--an immutable representation of the filesystem and class path combined with full interop with the existing JVM ecosystem.

Another issue with Unix pipes is that they are not awesome for tree-like things. In bash you pipe lines of text from one program to another, but tree processing is very difficult. Consider XML processing by pipes, or my jsawk library for processing JSON in the shell. These suffer from memory issues because it's hard to stream trees (if you have a 2G JSON file that is a single object, jsawk can't even read it in).


Actually, I should mention that perhaps the wording in the passage you quoted is misleading. We don't send the actual file contents around in memory as function arguments, but the fileset object itself _is_ passed in memory as an argument, and this fileset object contains the metadata you need to perform your work. When you need the contents of a file, there are protocol methods for obtaining the underlying file contents, and the actual underlying files are completely anonymous temp files of no special significance (their specific locations in the filesystem are irrelevant to the build process).
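The idea described here can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not boot's actual implementation or API: the names `FileSet`, `Entry`, `add`, `meta`, and `read` are hypothetical. The value passed between steps holds only metadata, the contents sit in anonymous temp files, and adding a file returns a new value rather than mutating the old one.

```python
import hashlib
import os
import tempfile
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Entry:
    """Metadata for one file; the contents live elsewhere (hypothetical type)."""
    path: str    # logical path inside the fileset
    digest: str  # SHA-256 of the contents
    size: int    # content length in bytes
    blob: str    # location of the anonymous temp file (an implementation detail)


class FileSet:
    """Immutable mapping of logical paths to metadata (hypothetical, not boot's API)."""

    def __init__(self, entries=None):
        # A read-only view over a private copy, so callers cannot mutate it.
        self._entries = MappingProxyType(dict(entries or {}))

    def add(self, path, data):
        """Return a NEW FileSet that also contains `path`; self is unchanged."""
        fd, blob = tempfile.mkstemp()  # anonymous temp file; cleanup omitted here
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        merged = dict(self._entries)
        merged[path] = Entry(path, hashlib.sha256(data).hexdigest(), len(data), blob)
        return FileSet(merged)

    def paths(self):
        return sorted(self._entries)

    def meta(self, path):
        return self._entries[path]

    def read(self, path):
        """Fetch the contents only when a step actually needs them."""
        with open(self._entries[path].blob, "rb") as f:
            return f.read()


empty = FileSet()
fs = empty.add("src/core.clj", b"(ns core)")
```

Here `empty` still has no paths after the call to `add`, while `fs` can report the size and digest of `src/core.clj` without touching the temp file until `read` is called.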



