This brings back memories of an ancient program called "patchy" that I encountered during my PhD.
Fun story, I was one of the last students doing his PhD at the Tevatron, a large particle accelerator near Chicago, before everybody's attention moved to the LHC. Our collaboration was winding down, and close to the end of my thesis I realized that there was nobody producing the simulations that I needed anymore, so I had to do it all myself. The first step was building the software. One program I used had been maintained by the same guy since the 70s, and you could see the accretion of layer upon layer as new scientific models got added to the program, but it was never rewritten. The code itself was in glorious FORTRAN 77, and to compile it you needed the aforementioned patchy. A patchy file is a plain text archive of all your source files, much like OP's stamp. In addition, you can add directives to each file to do a primitive form of conditional compilation (e.g. to include certain models, or to run on AIX).
I really wanted to use this one program because it could simulate things no other program could do. But the biggest hurdle was actually compiling patchy, which required a specific version of CentOS, CERNLIB, two decades' worth of patches on top, and a crazy bootstrapping procedure. I especially recall a manual for patchy, which proudly talked about laying a software foundation for the soon-to-be-built Superconducting Super Collider in Texas (which was cancelled while I was still in elementary school). The episode made me realize how deep the stacks behind the legacy software we use can sometimes be.
That allows a text file to contain several other files. Plain text files can be included as-is (as text/plain), and binaries can be included (as application/octet-stream). Each file can be given a name through a header like 'Content-Disposition: attachment; filename="foo.txt"'.
It can be hand-edited if you're reasonably careful. (Use boundaries, not Content-Length.)
And if any of the files would have been mergeable (with Git, etc.) as a separate file, it should also be pretty mergeable as a section of a MIME file. (Because it is included literally, and because presumably you give each one a unique filename, so the merge/diff algorithm has a unique Content-Disposition line to work with for each file.)
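For the curious, a minimal hand-written sketch of what such a MIME file could look like (the file names, boundary string, and the truncated base64 blob are all made up for illustration):
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain
Content-Disposition: attachment; filename="hello.txt"

Hello, world.

--BOUNDARY
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="logo.png"
Content-Transfer-Encoding: base64

iVBORw0KGgoAAAANSUhEUgAA...

--BOUNDARY--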
Cool, seems like the unix `cat` command with the addition of storing the filepath (so maybe more like `find`?)
While this of course sounds like a cool project, and my comment might be the most Hacker News-y thing [1], I think it's aimed at developers and so it fits.
I don't agree with the zip downsides listed in the repo: firstly, `unzip -l` exists to see what's in a zip, and secondly, the argument about re-uploading the zip doesn't work in favor of a different format, since you also need to update that format in a remote when it changes.
Secondly, I am somewhat amused by this format when `touch`, `mkdir -p`, and `echo` are available on every POSIX-compliant system and can be combined into a nice, coherent shell script which, without any dependencies, would cover the functionality of this project as far as I understand it.
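For example, a rough sketch of the equivalent with plain POSIX tools (all names and contents here are made up):
mkdir -p myproject/src myproject/docs
touch myproject/docs/NOTES.md
echo "console.log('hello')" > myproject/src/index.js
cat > myproject/README.md <<'EOF'
# myproject
Generated by a plain shell script, no dependencies.
EOF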
I don't want to sound condescending, but I'm feeling a little left-pad [2] on this one.
I see this as a declarative vs imperative approach. The interface is a bit barebones and heavy right now, since you need to fire up node and run the functions. But as with declarative vs imperative DevOps, the declarative nature of the resulting format seems like it can carry a lot of the same benefits.
POSIX scripting and other tools are a viable alternative as well. Though despite knowing all of the commands you’ve mentioned, and regularly sharing/applying git patches, I think I would still have difficulty renaming FileA to FileB in a tarball or patch prior to unpacking or applying it.
The declarative approach seems like a nice feature that greatly reduces the cognitive friction there. But as a single feature, it’s not clear whether it’s worth a tool change.
I use this pattern a lot along with a tool I built for doing server deployments and administration using plain old shell scripts and ssh (golem: https://github.com/robsheldon/golem/).
There are two caveats:
First, if there's any chance at all that the heredoc may contain a $, or a `, or possibly some other shell-magical characters, then you have to use a single-quoted heredoc:
cat <<'EOF'...
This means that if you want to do variable interpolation, like you're doing, then you need something that looks like:
cat <<'EOF' | sed -e "s/\\\$username/$username/g" -e "s/\\\$my_email/$my_email/g" ...
It looks yucky and unwieldy at first, but I've found that it's nice to be able to see at the top of the heredoc exactly what's getting replaced and what values it needs.
Second, if root privileges are required to write the file, then you need to use `tee`, because you can't sudo an output redirection:
cat <<'EOF' | sed -e "s/\\\$username/$username/g" | sudo tee /path/to/file >/dev/null
After using it for a while, I've found I really like this pattern for managing configuration file templates.
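Putting both caveats together, here's a minimal end-to-end sketch (the variable names, template contents, and target path are just placeholders):
#!/bin/sh
username="alice"
my_email="alice@example.com"

cat <<'EOF' | sed -e "s/\\\$username/$username/g" -e "s/\\\$my_email/$my_email/g" | sudo tee /etc/myapp/user.conf >/dev/null
# managed by deploy script -- edit the template, not this file
user  = $username
email = $my_email
EOF
Because the heredoc is single-quoted, the $username and $my_email markers pass through literally and only sed touches them.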
You're right though that something like Stamp could be built using standard shell tools if someone were so inclined.
shar is specifically mentioned in the footer, with the drawback that it's really an arbitrary shell script. ptar is also mentioned, and seems rather nice. It's also way better documented [0], though with glaring holes, e.g. how it deals with non-UTF-8 file content (it's not clear whether the file size or the delimiters take precedence, and why you'd have a closing delimiter if the file size rules). It also specifies file names as UTF-8 or ASCII, neither of which is sufficient to handle the full breadth of possible file names.
I guess that's true. I suspect the support for non-UTF8 names in modern tooling is very, very spotty, given how many config files and file formats that refer to other files use UTF-8 themselves. E.g. can you refer to one of these names in an nginx config? (just an example; I have no idea if its config is UTF-8 or not)
The main benefit, in my mind, over zip/tar is the built-in parameter substitution.
You could imagine using this as a development dependency, to standardize the creation of new (anything) that follows a predictable pattern. Check in your stamp file, and any junior dev who creates a new [anything] in your project can get all the custom boilerplate and best-practices right away.
Yes, the only real advantage over zip files seems to be the parameter substitution. I’d rather build that on top of the zip format, which supports extension by custom fields, and provide a wrapper around zip/unzip that adds the substitution functionality. Users of regular zip could still use the zip files then, just without automatic substitution.
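For what it's worth, a rough sketch of what the substitution half of such a wrapper could look like (the __PROJECT_NAME__ placeholder convention and file names are made up, and this doesn't touch zip's custom fields at all):
project_name="myapp"
unzip -q template.zip -d "$project_name"
grep -rl '__PROJECT_NAME__' "$project_name" | while read -r f; do
  sed "s/__PROJECT_NAME__/$project_name/g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done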
Zip unpackers usually spill files into already-populated directories or create doubly-nested contents, depending on how you prefer to unpack them: <here> or <into zipname>. The only app that does the right thing is The Unarchiver from the App Store. All other unpackers on all platforms force you to look into the zip file before unpacking. This is annoying AF.
I can’t tell from the article, but if it allowed “unstamp -“ and then simply copy-pasting into stdin from a site, that would be great.
I think the main takeaway over zip is that with zip you need to actually run “zip” before committing whereas here you can declare it in “code”. I don’t see the benefit over shipping a skeleton directory though.
Why don’t you write it with posix only tools then? It’s a great idea that didn’t exist yet. Who cares how the developer made it? I’m sure if it takes off people will rewrite it in go and rust with standalone binaries and the HN crowd will love it. I’m pretty sure people use the tools they’re most familiar with. Bashing people for their choice of tool is a popular comment on HN but doesn’t accomplish anything.
It's not bashing a tool per se, sorry if you read it as such. I was pointing out that with the abundance of tools we have at our disposal this project sounds to me like a solution looking for a problem.
As for the solution not existing yet: as people pointed out, there are tar, zip, git, and shar, which all seem to accomplish the same thing.
Yes, I read it, and I don't agree with the posted downsides for most of them, since the tool itself seems to share the downsides listed in the pros and cons of the other solutions, and doesn't seem to solve the problems it promises to.
Is visibility into what's being created a problem?
The listed problems like "everytime the template changes, author has to rezip and reupload the folder" don't seem to be solved by stamp to me; the problem is just shifted to versioning this new format and still distributing it somehow (probably through git), at which point you ask yourself why the git repo can't already contain the structure itself. If somebody wants to rename something, they can do that really easily, no need for variable substitution.
If there was a migration mechanism that would move files from an old template to a new one I would see value added.
Regarding the project templates, did you look at Cookiecutter [0]? There are over 5k projects that GitHub finds that use it [1]. It solves the “custom utility” problem, since it can be used with multiple templates. I can just install `cookiecutter` and immediately use any of the 5k+ templates, and I get variable substitution, and some basic logic in the code. I also don’t need any special structures — I prefer 4-space tabs, and to write or edit the `stamp` file manually, I would need to remember how to tell vim to disable all indentation support, since I’d have tabstops at 1 (the `data` line), 2 (top-level code), 6 (one level of indentation), 10, …
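For anyone who hasn't used it, the workflow is roughly this (the template URL below is a placeholder; any Cookiecutter template works the same way):
pip install cookiecutter
cookiecutter https://github.com/your-org/your-template
# prompts for the template's variables, then renders the project directory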
Cookiecutter is nice but it requires an entire python install to run, which is a big thing to ask for some of the scenarios mentioned by the tool creator (like someone going through a simple learning tutorial which might not even be using python at all).
IMHO gomplate is a nicer alternative that's just a single static go-based tool that can do everything cookiecutter does and a lot more: https://github.com/hairyhenderson/gomplate
All Linux distros ship with Python. macOS installs it with the developer tools (which you’re likely to need anyway). Windows makes it very easy to install (type `python` in cmd, you’ll get sent to the Microsoft Store).
I don't think I've seen that! Thanks! Happy to accept a PR with a link to it.
The "template package manager" sounds very cool. Perhaps Stamp as a format for the CookieCutter package manager would be a good combo. Not sure, I haven't taken a close look; just thinking out loud.
As someone who does a lot of "roll your own" programming as a scientist, I never got the appeal of overly nested directories in web dev and other fields. It feels like it makes things more complicated, at least for small projects like the very example they show here. Might as well just have a single directory for three files.
Reminds me a lot of OOP examples in tutorials that make a class that only has two methods: too much boilerplate. Like OOP, it becomes useful for large systems (GUI libraries), but it's overkill when you have fewer than ten files, I think.
EDIT: too late to edit, but what I mean at the end of the last sentence is the directory structure, not OOP: that fewer than ten files can live in a single dir without confusing people (or me at least).
> ...I never got the appeal of overly nested directories in web dev and other fields...
This is often a side effect of many competing interests trying to cram their incentives into a single ontology and then taxonomy. No strong consensus (or a lack of highly opinionated direction) to shape a single concise ontology and taxonomic implementation, so Design By Committee creeps into the decision-making, and creates enough branching to satisfy the "gotta catch them all"-itis to capture all the requirements.
That in turn is usually a side effect of the business not knowing its domain sufficiently well to articulate to a granular-enough detail the implementation priorities, and improperly punting that decision to the development organization. "We need this requirement for sure that will as a result prioritize the domain under this ontology. No, wait, we need that. No, wait, we need both even though they are overlapping but mostly disjoint ontologies, and in some places contradictory...."
Which, finally, is usually in turn a side effect of people leaders managing by KPIs instead of leading teams to produce results that happen to affect KPIs as a happy side effect. Which is probably where the "leadership that comes up through the ranks is the best leadership" idea comes from, because we often do not find leaders who embody that "lead the teams, not the KPIs" characteristic without the deeply internalized knowledge of the domain acquired by such arduous, time-consuming work over years and often decades within the domain.
It's a tough abstraction stack to deal with for everyone involved. There are good solutions, just not quick and cheap ones of course.
Debugging text is easier and the bulk of the data is binary anyway, so there would be no great file savings with a "pure" binary protocol. That's the same rationale for HTML and text-based internet protocols -- they are easier to write and debug.
This is a particular concern when you're talking about a file format that has to work on a wide variety of architectures and operating systems. Having a common ASCII encoding makes it significantly easier to build an interoperable file format.
I remember telnet-based installers back in the early 1990s that used the pipe-to-sh trick along with shar archives to install things like IRC clients onto your workstation.
Expanding a bit: this would be great in an IDE. Often I have wanted to simply select a bunch of files and edit them at once (maybe with a special temporary comment between them) as a single file, then save the changes back to the individual files. It would be nice if that were a right-click context menu item or a bindable command.
In C(++) land I could see it being extremely useful for h/c(pp) juggling if the IDE automagically did the combining.
Finally, being able to write the comments that separate the files as I type would be really nice when prototyping new code you want all together before you split it up.
Also, perhaps if it's common enough, certain meta files that describe links between other files could be left as shortcuts in the repo. When you open those, the files (or portions of files) they reference are opened for editing.
I'm really surprised that idea hasn't been doubled down on in IDEs. I still use a handy little Mac app called Rename It to edit a bunch of file names at once, because I find DirEd a bit clunky. I've taken some stabs at building this kind of editor, but it's definitely hard to nail. At least for the Tree Notation web IDEs we can get it added eventually; perhaps it will be easier now that CodeMirror 6 is out (which I use for the fancy syntax highlighting and all that in the current Language builder IDE).
But I 100% agree with your ideas here. If something like this (either as one scrollable buffer, or perhaps a bunch of buffers in a 2-D spreadsheet-like interface) ends up being the primary IDE view in 10 years, I would not be surprised.
Emacs + Org-mode can sort of do this using org-babel. A lot of people use it for literate programming, or even just for documenting their own emacs configuration.
I'm not sure if there's already a plugin or script that can take a selection of files like you mentioned, but it shouldn't be hard to write one either.
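For reference, a rough sketch of the org-babel version (paths and contents are made up): running `org-babel-tangle` (bound to C-c C-v t) writes each block out to its :tangle target.
* dotfiles
#+begin_src conf :tangle ~/.config/myapp/settings.conf :mkdirp yes
theme = dark
#+end_src

#+begin_src sh :tangle ~/bin/greet.sh :mkdirp yes
echo "hello from a tangled script"
#+end_src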
It would be interesting to generate HTML as an output. This way it could inline text files, images (data URIs), videos, audio, and other files as <a href="data:application/octet-stream;base64,..." download="filename.ext" ...>. You could get MIME types from file(1) so as not to put octet-stream everywhere.
Basic version wouldn't even need a single JS line.
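A rough sketch of that idea in shell (all paths and names are made up; assumes base64 and file(1) are available):
out=bundle.html
echo '<html><body>' > "$out"
find src -type f | while read -r f; do
  mime=$(file --brief --mime-type "$f")
  b64=$(base64 < "$f" | tr -d '\n')
  printf '<a href="data:%s;base64,%s" download="%s">%s</a><br>\n' "$mime" "$b64" "$(basename "$f")" "$f" >> "$out"
done
echo '</body></html>' >> "$out"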
Tree notation looks fun... I was reading what I think is the spec (https://github.com/treenotation/blog.treenotation.org/blob/m...)? I honestly can't quite make heads or tails of it, but I do get a sense that giving cells 2D size is important. Then I looked at the language examples and... none of them seem to really use this idea of cell size??
The beauty of this stuff as it gets going is that all of these seemingly simple langs will be able to take advantage of the 2D and 3D stuff.
I always stretch myself thin, but I'm starting to see other people do really cool things with this concept of 2- and 3-dimensional languages, an area of research that has been really quiet for ~50 years. If anyone is interested, I'm telling ya, I still can't see the limit! :)
Oh, this looks similar to my "motllo" project, [1] (and so many other projects, mine wasn't the first either). I have variable substitution, but no additional logic. For me the point was having a "readable" representation of the template.
Oh no need to, it's not that related and I think both stand out well on their own. As for the animated demo, it was with asciinema [1] (I think, it's usually what I have used in the past for this). Thanks for your good work!
I’m bemused to observe that even though you include a section describing alternatives, there are still the usual number of comments saying “why not just use x” in typical HN style.
I often want a simple file based templating system, and this is a nice example because it’s closer to being a declarative standard that could be reimplemented in various languages and for any platform.
Why should I use your tool vs. a one line bash script for creating a directory tree?
Why would I keep any file content in the stamp file, in the version control history, if I can keep, say, a bootstrap repository with some "templates" and check them out without history (git archive, git checkout-index)?
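For example, something along these lines (the URL, repo layout, and target directory are placeholders):
git clone --depth 1 https://example.com/team/templates.git
mkdir -p my-new-project
git -C templates archive HEAD webapp/ | tar -x -C my-new-project --strip-components=1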
It seems like a kind of relation, where 'parent directory' can be any string, e.g. a file name. Cool. But I'd much rather just have a relational way of looking at my data objects and metadata, without this ancient, obsolete notion of 'folders'.
I've watched OSs come and go. They all make about the same mistakes: including a file system as part of the OS (instead of as just another service), and organizing it around directory / parent directory / file.
I'm just an old guy, moaning about how little OS technology has changed or grown in 30 years.
The trees vs graphs debate is one that constantly comes up.
I'm obviously a tree guy, but of the things that I am most uncertain of, this is one of them.
To me it seems when you dissect things, you can always make a tree, and trees are simpler, therefore everything is a tree.
However, trees don't come to life until you have motion/time, and I could see maybe how the graph is the ultimate data structure.
You can't dissect graphs into 2 dimensions like you can trees, however, given the constraint that wires cannot cross. You can create models of graphs, but not arbitrary graphs themselves. Whereas there is no tree that you cannot create in 2 dimensions.
I never thought about using diff+patch as an alternative to tar, but I just tried this and it does work as expected:
mkdir dir1 dir2 dir3   # dir2 stays empty as the "before" side of the diff
for x in {1..99};do echo $x > dir1/$x.txt;date >> dir1/$x.txt;done
diff -urN dir2 dir1 > dirs.diff   # -N treats absent files as empty, so every file shows up as new
cd dir3
patch -i ../dirs.diff -p1   # -p1 strips the leading directory component
After that, dir3 has the same contents as dir1 had. I couldn't figure out how to make diff consider all files in a directory as new though, without having an empty directory to compare it to.
I generally am against snark, but I think that in this case it communicated something quite useful about how we can go through the same thought process that resulted in a tool that has existed forever, and not realize we have just reinvented it.
Disclaimer: I've never used this tool, today is the first time I've heard about it.
Reading the shebang of a stamp file is a lot easier than scanning a bash file for any sneaky obfuscations (including any custom logic that the author may have "helpfully" included).
Alternately, the security-conscious can execute it directly as `node --use_strict /usr/local/bin/tree my.stamp`.
> Reading the shebang of a stamp file is a lot easier than scanning a bash file for any sneaky obfuscations
This.
However, the feedback in this thread is right.
Looking back over the past few years, I almost never used stamps as executables.
Generally I always use it as a library, or via something like `unstamp someStampToExpand.stamp`. So the shebangs and executability of stamp files were stupid, and added a lot of complexity for almost no gain.
I've just gone ahead and removed them.
I think now it should be a little clearer how easy it is to write stamp functions in other langs.