“Make” as a static site generator (2022) (karl.berlin)
285 points by bundie on Sept 10, 2023 | 157 comments



My personal website (https://pablo.rauzy.name/) used to be generated using a simple Makefile.

Then I added features like news and an RSS feed, a way to automatically list my research publications and course materials, a list of books filterable with tags, etc. So it is still a Makefile, and the Makefile itself is a bit simpler than it used to be, but it now calls a few Bash scripts that make use of the awesome xml2 and 2xml utilities to manipulate HTML in a line-oriented manner using the core utils (mostly grep and sed).
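
For anyone curious what that pipeline looks like, here is a hedged sketch (html2/2html are the HTML-aware companions shipped in the same xml2 package; the title rewrite is just an invented example):

    # flatten HTML into one /path=value record per line, edit with sed, fold it back
    html2 < page.html \
      | sed 's|^/html/head/title=.*|/html/head/title=New title|' \
      | 2html > page.new.html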

On top of that I have a few git hooks that call make automatically when needed, in particular on the remote server where the website is hosted, so that the public version is rebuilt when I push updates to the repository there.
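
For reference, a minimal sketch of such a hook, assuming a bare repository on the server and an invented work-tree path and branch name (not the commenter's actual setup):

    #!/bin/sh
    # hooks/post-receive in the bare repo: check out the pushed content and rebuild
    GIT_WORK_TREE=/var/www/site git checkout -f main
    cd /var/www/site && make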

It's been working like a charm for years! My git history goes back to 2009.

EDIT: I just had a look at the first commits…

    beccad7 (FIRST_VERSION) Initial commit
    d1cc6d7 adding link to Google Reader shared items
    6ccfd0c fix typo
    d337959 adding link to Identi.ca account
… 15 years have passed indeed.



One of the nifty parts about having a static site generated offline from trusted inputs is that it doesn't matter whether the generator components are "abandoned" or complete.


Ehhh... assuming their dependencies and your operating system maintain compatibility with it on the order of decades, yes. Which do exist, but they're understandably rare.

And it's complicated by this only being knowable in retrospect, as you can't predict the future. "Not abandoned" is a positive sign for "if we failed to predict the future correctly, it'll be fixed", rather than mostly relying on luck.

(Thankfully full-blown simulation is often an option nowadays too)


> (Thankfully full-blown simulation is often an option nowadays too)

Yes, and I think it's fair to assume that some backend to execute the x86-64 Linux ABI will outlive most readers.

Projects like https://justine.lol/ape.html, https://guix.gnu.org/manual/en/html_node/Invoking-guix-pack.... or https://github.com/matthewbauer/nix-bundle do make it approachable to "bundle" a lot of software down to libc.


Take a rootfs snapshot, have it available as a container.


I release all I need (including data) inside a docker container.

Running my website with 2 containers: one is the webserver and the other is the data container.

So only the registry needs to be backed up.


s/abandoned/crystalized/g


FFS not everything needs constant churn.


If converting markup to/from a line format so you can put awk, perl, and other line-oriented tools to use is your thing, there's also the ESIS format, which is understood by traditional SGML tools and is even used by the SGML formal test suites.


Maybe it’s just mature? Sometimes projects actually become “finished” and don’t need any updating.


Upstream source repo is gone, there are a couple of stale GitHub mirrors/forks - and the package lacks a maintainer in Debian:

https://tracker.debian.org/pkg/xml2

And eg Arch gets the code from Debian as far as I can tell:

https://aur.archlinux.org/packages/xml2

I don't really expect there to be lots of new features - but bug fixes seem likely even for a mature utility.


A problem with this approach is that deleting a file from source/ does not delete it from build/.

In my own projects, simply rebuilding the whole site is fast enough, so I opt to remove the whole build folder before a rebuild:

https://github.com/jez/jez.github.io/blob/source/Makefile#L1...

This defeats a big part of why you’d want a build system in the first place (incremental builds), but at least if you know the page you want to regenerate you can still `make` that file directly.

If there’s a common workaround for this pattern in makefiles I’d love to learn it.


Not sure if it’s a common pattern, but my solution to this was to always run a command that deletes all “unexpected” files, using GNU Make’s “shell” function to enumerate files and the “filter-out” function to filter out “expected” outputs. Edit: I ensure this runs every time using an ugly hack: running the command as part of variable expansion via the “shell” function.

Edit to link my Makefile: https://github.com/jaredkrinke/make-blog/blob/main/Makefile
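
A hedged sketch of that trick (variable and directory names are invented here, not taken from the linked Makefile):

    SOURCES  := $(wildcard source/*.md)
    EXPECTED := $(patsubst source/%.md,build/%.html,$(SOURCES))
    ACTUAL   := $(shell find build -name '*.html' 2>/dev/null)
    # expanding this immediate variable runs the cleanup on every invocation of make
    cleanup  := $(shell rm -f $(filter-out $(EXPECTED),$(ACTUAL)))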


File deletions and renames are common problems with many revision control / build systems.

Other than the nuclear option ("make clean"), another is to have a specific rename / remove make target, so:

  make rm sourcefile
or

  make mv sourcefile newsourcefile
... which will handle the deletion and renaming of both the original and generated targets.

In practice for even fairly large blog and online projects, a make clean / make all cycle is reasonably quick (seconds, perhaps minutes), and is often called for when revising templates or other elements of the site design. If you're operating at a scale where rebuild time is a concern, you probably want to be using an actual CMS (content management system) in which source is managed in a database and generated dynamically on client access.


> If there’s a common workaround for this pattern in makefiles I’d love to learn it.

"make clean"?


How does that solve the problem? That forces a total rebuild, which is exactly what he said he didn't want.


Yes, I didn't read properly.

I guess you could do some magic to delete "unexpected" files, but are there tools which do solve this problem?


The cleanest way to do it is essentially "make install". You do all the heavy build steps into a build directory, and then the final stage is to delete the "output" directory and copy all the files you need there. Incremental builds should still be pretty fast since the only repeated action is copying files (and you could link them if you want instead).
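
A hedged sketch of that shape, with invented directory names (assuming an `all` target that does the heavy build into build/):

    # build/ keeps intermediates for fast incremental builds;
    # output/ is wiped and repopulated only from finished artifacts
    install: all
        rm -rf output
        mkdir -p output
        cp -R build/. output/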


Conventionally, install puts the outputs in their where they will live for use. It does so in a way that 'make uninstall' will leave things as they were before install. The install target should also run any pre- and post-install commands. There's also a 'make dist' convention, to build a release tarball.


This is the way, because intermediate build artefacts also end up in `build/`. You don't want those in your `output/` directory, but you also don't want to delete them because they help speed up the incremental builds.

Edit: `make install` also protects you against broken builds breaking your live site.


Something like this should do the trick:

  rm/%.html:
    @rm -f source/$*.html build/$*.html
Run with:

  $ make rm/page.html


The shake build system (a general-purpose build system similar-to/better-than make) has a "prune" feature for exactly this purpose:

http://neilmitchell.blogspot.com/2015/04/cleaning-stale-file...

But I think the best solution (that also works with make) is to have a "make dist" target that creates a final .tar.gz archive of the result. If the rule is written properly then it won't contain any stale files. The disadvantage is for large project it may be slow, but you are not supposed to use this rule during development (where it is useless anyway), only for releases (which still can be built incrementally -- only the final .tar.gz needs to be created from scratch)
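
A hedged sketch of such a rule (layout invented); because the tarball is assembled only from the list of expected outputs, stale files lying around in build/ never make it in:

    PAGES := $(patsubst source/%.md,build/%.html,$(wildcard source/*.md))

    dist: $(PAGES)
        tar czf site.tar.gz $(PAGES)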


Not sure if anyone actually uses it, but I would approach the problem with find, comm, and a sprinkle of sed:

    comm -23 <(find build -type f -iname "*.html" -printf "%P\n" | sed 's/\.html$//' | sort) \
             <(find source -type f -iname "*.md" -printf "%P\n" | sed 's/\.md$//' | sort)
The find commands get you a list of all the files (and only files - directories will have to be removed in a separate step) in each of the build and source folders, sed chops off the extension, while comm -23 compares them, printing only the files unique to the build folder, which you can then deal with as you see fit (e.g., by feeding them to xargs rm).


Using comm was exactly what I did use in my little experiment of a barebones SSG.

I did save a list of generated files and compared them. This one liner is the meat of the whole solution:

  comm -23 <(awk 'NR>1' "$DSTDIR/build-info") <(find "$SRCDIR" -name "*$EXT" -type f -printf "%P\n" | tee >(gen_index) | xargs -n1 "$0" "$SRCDIR" "$DSTDIR" | sort | tee -a "$DSTDIR/build-info.new") | (cd "$DSTDIR" && xargs rm)
Full source here: https://gist.github.com/hadrianw/060944011acfcadd889d937b960...


I think the best solution is to use something like webpack or vite or whatever. These usually have their own dev server and can watch directories for changes.

My personal site is also using a custom make-like ssg, but after spending a disproportionate amount of time writing the bundling/packaging code, I decided to just switch over to one of these tools. It’s a solved problem, and it greatly reduced the complexity of my site.


I use Nix, so I get incremental builds and your problem goes away.


You’re using Nix to drive your static site generation? If so, please share more details because that sounds intriguing!



Wow, I ask and a blog post just appears!

My initial reaction is: I should probably get around to learning about Flakes. I’m not sure I’d want each blog post to pin its templates, but it’s nice to have that choice.


A flake is a function, right? Invoking `nix build` resolves its inputs and computes its outputs. So if the static site is the output, wouldn't the content have to be one of the inputs?

Personally, I think I'd use two flakes, one that builds the content into something that's ready to hand-off to code, and a second one that turns it into a usable site. That way you wouldn't end up with a new input for each post, but instead would have a versioned something which represents your content all bundled together, and then the site builder consumes it as a single dependency--but conceptually it's the same.

I guess I'm just saying that it's not conventional, but it's a pretty logical conclusion to reach.


I was just reading this and thinking that `nix build` would do the same trick even more nicely.


I posted a link to the other comment, but here it is for you as well: https://juuso.dev/blogPosts/nix-as-a-static-site-generator/n...


I admire how nimble you are. I aspire to write blog posts at the drop of a hat like this, but I rarely do.

I also like the use of flake inputs for content.

It reminds me of a world that I've been imagining where the conclusions in scientific papers are generated as flake outputs (an adjacent output would be the human-readable thing, a PDF or whatever).

In this world, you can just run `nix flake update && nix build`, and if a paper that you cite published an update which invalidates your conclusion, you know right away because their output is your input, so your build fails.

We think about repeatable builds being for executable binaries, but they could equally well be for conclusions and assumptions.

Perhaps nix is too big of a hammer for the job, but it seems like the best shot we have at achieving this without also constraining the scientist re: tooling.

I realize that you don't want to be storing mountains of data in the nix store, but it would work just as well if the output in question is an IPFS CID, to be resolved during the build instead. The publisher can then be in charge of keeping that CID pinned and of notifying scientists when their "build" starts failing.


> I admire how nimble you are. I aspire to write blog posts at the drop of a hat like this, but I rarely do.

Thanks! I took up blogging more often as of recent, and for me, having a manageable system is a large part of that. The last thing I want to happen on a Sunday evening is breaking some page of my website. That being said, I hope to one day make the workflow easier.

> It reminds me of a world that I've been imagining where the conclusions in scientific papers are generated as flake outputs (an adjacent output would be the human-readable thing, a PDF or whatever).

I happen to be a reviewer for software artifacts in a scientific journal, and I often use Nix here. Not that many projects do use it, but if I'm able to reproduce it with Nix, then I know the author has not missed any implicit dependencies. I like to imagine it's also useful for the authors as feedback, whether they use Nix or not.

> I realize that you don't want to be storing mountains of data in the nix store, but it would work just as well if the output in question is an IPFS CID, to be resolved during the build instead.

I maintain separate build servers of my own (so-called remote builders) using Nix integrations, and the Nix cache is quite large already, sitting at around 500GB. I host these at Hetzner.

I have also thought about adding IPFS integration for my website, but haven't got around to it.


> I happen to be a reviewer for software artifacts in a scientific journal

That's very cool. I have a question for you.

I'm taking a bioinformatics class, despite not having the chemistry prerequisites. I'm getting a crash course in biochem, and the rest of the class benefits from having an expert in what-kind-of-quotes-to-use.

I've been thinking: would it be helpful if the care and maintenance of these compute environments wasn't left to each scientist but was instead aggregated (perhaps per-class or per-university)?

We're setting these chemists up with conda in Ubuntu in WSL in a terminal whose startup command activates the conda environment. Not exactly a recipe for reproducibility after they get a new laptop.

What if certain compute-heavy classes published flakes which the students could...

a) use while taking the class so we stop wasting time on troubleshooting ssl deps via conda

b) reference in publications after the fact. They could say:

> Here's a Jupyter notebook, download it and run it in the UCCS biochem environment like so: `nix run github:UCCS/CHEM4573?rev=16afd67`, its output lets us make the following conclusions...

I know it would be helpful for the students in the class. Do you think it would be helpful to them later on when they were publishing things?

I'm thinking about packaging the dependencies for this class, giving it to the teacher, and pitching it to the university:

> Set up a technical fellows program. Waive tuition for us nerds and in exchange we'll support your students and faculty through the maintenance of these environments.

I don't mind paying tuition so much, but I'd like to do something to get a bit more cross pollination going between scientists in need of tech support and techies in need of something meaningful to work on.

Am I dreaming here, or would it solve some problems? Do you think I have a shot at convincing anybody?


Not sure what your issue with reproducibility using conda is. We (a team of RSEs working with many researchers) have had good success with storing conda environment files in git alongside the code; it only takes a few commands to get a working environment. We provide classroom training to researchers and provide the training material and environments this way.


I don't have the link to the github issue handy, but I remember that the key was to ask for a lower version of python and then supply the `--force-reinstall` param. The fact that it only happened to some students was evidence that conda wasn't as hermetically sealed as nix.

My real gripe is that in that issue, the app developer couldn't really help--since it was a packaging problem--and the conda folks were unaware because the users had gone straight to the app developer. If there must be a third party doing curation, it seems to me that they should be more narrowly focused on whatever particular suite of tools enables whatever particular group of people--not on individual packages.

I know that conda lets users do this too, but I don't think that the environments compose as well as they do with nix. If you want Jim's environment, but with Susie's custom build of foo-tool, you can just take both as inputs, overwrite foo-tool as desired, and output the composition. Your maintenance burden remains small. If conda handles environments with this kind of compositional attitude, I'm unaware of it.


> I don't have the link to the github issue handy, but I remember that the key was to ask for a lower version of python and then supply the `--force-reinstall` param. The fact that it only happened to some students was evidence that conda wasn't as hermetically sealed as nix.

Conda can be a bit fiddly, it makes life much easier if you specify the required version of python when you create a new environment.

>I know that conda lets users do this too, but I don't think that the environments compose as well as they do with nix.

Yes, though we find Conda great for most of our use cases, we still have to resort to creating containers.


Couple of thoughts:

> I've been thinking: would it be helpful if the care and maintenance of these compute environments wasn't left to each scientist but was instead aggregated (perhaps per-class or per-university)?

This is definitely something that Nix can abstract quite well. In my company we have an infrastructure of computers (https://github.com/ponkila/homestaking-infra) that we manage with NixOS. We have set up the system such that `cd`-ing into a directory "conjures" the environment using devenv or direnv. We don't do anything too fancy yet, but we have a project commencing next month in which we start to also manage routers this way. We speculate that this will help us to do things such as the following: register a new node, and it gets automatically imported by the router, which establishes DNS entries and SSH keys for each user. The idea is that we could have different "views" of the infrastructure depending on the user, which the router could control. For administrators, we have a separate UI created with React that pulls NixOS configuration declarations from a git repository (note: these don't have to be public) and shows how the nodes connect with each other. The UI is still under construction, but imagine this with more nodes: https://imgur.com/a/obBfRk0. We have this set up at https://homestakeros.com.

Depending on a project you are working on, you could then have a subset of the infrastructure be shown to the user and have things such as SSH aliases and other niceties set up on `cd` in. When you `cd` out, then your view is destroyed.

We have quite overengineered this approach -- we run the nodes from RAM. NixOS has the idea of "delete your darlings", which is having a temporary rootfs. We have gone the extra mile in that we don't even store the OS on the computer; the computers boot via PXE and load the latest image from the router (though any HTTP server will do, I boot some from CloudFlare). We do this because it also forces the administrators to document the changes that they make -- there is nothing worse than starting to call up people when there is downtime and trying to figure your way back up from what the mutations are. PXE booting establishes a working initial state for each node -- you just reboot the computer, and you are guaranteed to get into a working initial state. I'm personally big on this -- all my servers and even my laptop work like this. We upgrade servers by using kexec -- the NixOS configurations produce self-contained kexec scripts and ISO images for hypervisors (some stakeholders insist on running on Proxmox). I've suggested some kernel changes to NixOS which would allow bootstrapping arbitrary-size initial ramdisks, because otherwise you are limited to a 2GB file size.

> We're setting these chemists up with conda in Ubuntu in WSL in a terminal whose startup command activates the conda environment. Not exactly setting them up for reproducibility if they ever move to a different laptop.

Python specifically is a PITA to set up with Nix; dream2nix etc. might help, but it's definitely the hardest environment to set up of all the languages I've tried -- even GPGPU environments are easier. Oftentimes, the problem is not only the packaging, but also the infrastructure used. For that, you could also publish the NixOS configurations and maybe distribute the kexec or ISO images.

A notable thing is that devenv also allows creation of containers from the devShell environment, which may further help your case. Researchers could reference docker images instead of insisting that everyone use Nix.

In any case, I put some emails on my HN profile so we can also take the discussion off platform -- we are looking for test users for the holistic approach using PXE, and we are currently funded until Q3 next year.


Reading through your link I caught myself wondering whether I would put up with all those boilerplate Nix steps just to add a new page to the site.

Don't get me wrong, I get that you gain a big amount of flexibility out of doing it the way you do, but if we think about the task at hand, adding a page to a predefined blog, it seems a bit involved.


A fair comment, I do not disagree. I do plan to one day do an `ls` command on the root Nix file so that the manual update to the root flake for both the inputs and the RSS feed would be redundant.


I was instantly inspired by Karl's work on his "blog.sh" shell script[0] that he mentions in this article. I took it and tweaked it to create my own minimalist SSG called "barf"[1]. That wouldn't exist if Karl didn't share his wonderful work publicly!

[0]: https://github.com/karlb/karl.berlin/blob/master/blog.sh [1]: https://barf.bt.ht


Ah, a fellow person of culture. Mine is called shite [1], which makes my site [2]. The name alludes to the software quality :)

What I like most about it is I haven't had to upgrade anything, and don't expect to ever have to. And a close second: it "hot reloads" without JavaScript.

[1] https://github.com/adityaathalye/shite

[2] https://evalapply.org


Haha I sense a trend for these home grown static site generators :-)

Yours are much more advanced, but a few years back I made a minimal PHP static page generator and named it...

PHP keep It Stupid Simple, or in short P.I.S.S.

https://blog.nyman.re/2020/10/11/introducing-piss-a.html


Well, if you see my templating code, I've basically written PHP, but in Bash :D


Adding a pinch of m4 [1] can give you a bit more flexibility while sticking with the same barebones approach.

I used to maintain a small website built like that some 20 years back. But I can't see the model working today, personal websites excluded. The problem is that the approach essentially enforces Web 1.0 roles: you either need every contributing user to be HTML-proficient, or someone willing to assume the drudgery of being the "webmaster".

[1] https://en.wikipedia.org/wiki/M4_(computer_language)


There is no such thing as a "pinch of m4". You start a clean project promising that you won't touch m4 this time. Then you add a small m4 invocation to save yourself from some boilerplate.

A year later, when you are trying to figure out why all instances of the word "cat" are silently disappearing from your website, you dig through 5 layers of macro expansions to discover that a junior dev tried implementing a for loop instead of copying it from the manual and messed up the quotation marks.

Having solved the immediate issue, you decide that debugging your DSL is too hard, so you import the M4 macro file you have been copying between projects. You then spend a day replacing all usages of 'define' with your macro-creating-macro that adds comments to the output, enabling your stacktrace generation script to work.

Next project, I am putting down a hard rule: no m4! (Except for maybe that one instance)


Please write more to this story


Not "this" story. Everything above has happened on several projects. The cat thing comes up because it is tricky to expand two macros next to each other without whitespace. So if you do:

  define(`foo',`hello')
  define(`bar',`world')
  foo bar
  foobar
You will get:

  hello world
  foobar
Working around this gets tricky, so someone inevitably ends up writing a cat-like macro such that you can do

  cat(foo,bar)
To get

  helloworld.
A side effect of this is that now "cat" is really "cat()" which expands to "". You can work around this by doing `cat'. However, if `cat' is used as an argument to another macro (such as a for loop), the quotation only prevents expansion the first time. When the for macro is expanded, the quotation marks are stripped, giving you just "cat", which gets expanded again. A correctly written for macro would add new quotes as needed, but I have never seen someone correctly write such a macro without just copying it.

Not sure if I have seen this interaction specifically with for and cat, but I have seen an interaction like it on almost every project that used m4.


You can place an empty "expansion" in that line to get the behavior you want without an additional cat-like function

    foo`'bar
I only know of this feature because I recently read the manual page for m4, and it's mentioned rather early in there, but it might not have been as well emphasized in past iterations of the manual.


For completeness' sake, though, the easy way to do it is:

    foo`'bar
The empty quotes make foo and bar separate words.


I've only ever used m4 via autoconf and sendmail configuration files, so I don't know if it's m4 that has the bizarre syntax or whether it's autoconf's and sendmail's use of it. I'm not sure I've ever tried to use m4 directly for anything.


That's why I wrote a "pinch" in the OC. m4 is arguably better than cat as a barebones template engine. The moment you start doing anything beyond simple includes and variable interpolations, it is time to switch to something more modern and robust.

I guess autoconf/sendmail still use m4 because there wasn't anything better at the time that didn't come with a kitchen sink attached.


I know that story too well. Finally, I thought that if I have to code, I should just use a programming language.

Now, I use nodeJS to replace every m4 file with mustache.js and some JS logic and I don't feel limited anymore. The complexity doesn't increase much.


Rather than relying on generic text substitution using m4 or perl or whatever, I suggest using SGML, the basis and common superset of HTML and XML, which comes with easy type-checked text macro (entity) expansion for free, or even type-aware parametric macro expansion. Here "type" refers to the regular content type of a markup element (i.e. its allowed child elements and their expected order), but also considers expansion and escaping into attributes or other contexts such as CDATA or RCDATA. Only SGML extends to the case of properly expanding/escaping potentially malicious user comments with custom rules such as e.g. allowing span-level markup but disallowing script elements; it can also do markdown or other wiki syntax expansion into HTML, import external/syndicated HTML content, produce RSS and outlines for navigation, etc. It works well for nontrivial static site preparation tasks on the command line; cf. the linked tutorial and command-line reference.

[1]: https://sgmljs.net/docs/producing-html-tutorial/producing-ht...

[2]: https://sgmljs.net/docs/sgmlproc-manual.html


What is sgmljs? There doesn’t seem to be any explanation on the site.


A comprehensive package for processing, converting, and serving SGML on the command line, on the server side, or in the browser; see [1]. Also features SGML DTDs (grammars) for W3C HTML 5, 5.1, 5.2, and Review Drafts January 2020 and 2023, which are the latest non-volatile W3C and WHATWG HTML recommendations/spec versions.

[1]: https://www.npmjs.com/package/sgml

Edit: your comment is a welcome reminder to improve the site, which isn't an easy thing to do however due to the sheer volume of the material, even though it's using SGML for shared boilerplate inclusion, ToC/site nav and page nav generation, etc. (in fact, by calling sgmlproc from a Makefile)


Instead of `m4` or `sed` find and replace, the author should try `envsubst`. It's a program that replaces bash style variable references (for example `$TITLE`) with their value in the environment.

    export TITLE="..."
    envsubst < page.html


I agree that `envsubst` is a good choice for this. Unfortunately, it is not part of posix, so you can't rely on it being present everywhere. But as part of gettext, it is still very common.


The problem is that the $SOMETHING syntax is just too common if your site is a technical one, and you'll end up substituting too much.


You can specify which variable names are valid, reducing the likelihood of a collision.
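
For example, GNU gettext's envsubst takes an optional argument listing the only variables it may substitute (the names here are invented):

    export TITLE="My page" DATE="2023-09-10"
    # only $TITLE and $DATE are replaced; any other $WORD passes through untouched
    envsubst '$TITLE $DATE' < page.html > out.html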


> a pinch of m4

nononononononononono for the love of everything please no

m4 isn't even a good esolang!


At the dawn of the age of PHP, I created a user management system (registration, verification, admin interface, …) that was based on well-established ideas (how login worked at Yahoo, Amazon, and every other major site) but got no traction at all as an open source project. In any language that wasn’t PHP it would be necessary to write an “authentication module”, which was about 50 lines of cookie handling code. Multiple times I managed to put several existing apps together and make an advanced web site.

About 10 years ago the idea suddenly got traction once it was legitimized by the SaaS fad. I would tell people “don’t you know they’re going to end the free tier or go out of business, or both?” and sure enough they did.

Anyhow, I bring it up because the system used M4 to interpolate variables into PHP, other languages, shell scripts, SQL scripts, etc.


Ugh, I know exactly how this feels. You resist the urge so hard to say “I told you so” and instead relish in the fact that you saw it. “The Way”, so to speak.

I remember having to write CGI cookie handling code. I remember having to write session-cookie sync code. PHP was a small slice of heaven in the CGI world. Until it wasn’t. Still, being able to import libraries of script functions without having to recompile was wizardry. The problem with PHP now is they let a certain product somewhat dictate their direction. Class namespaces with slashes is the ugliest design choice.

What was your oss project that couldn’t get traction?


It was called Tapir User Manager but the web site was down for some time. It was an open source failure but a career success because I used it around 8 projects including the arXiv preprint service, a voice chat service that got 400,000+ users, and the web site for our county green party (which had national impact.)


You need to combine it with Perl and a collection of other special passes, of course: https://web.archive.org/web/20180309134414/http://thewml.org...


I did that once. Never again.

Just because it worked for sendmail is not sufficient justification for anything.


A lot of older unix software config is complicated and cryptic.

sendmail, bind, apache, older X11, sudo are examples that come to mind.


no shit. not everything is yaml and emoji status lines.

but that's got nothing to do with my mistake of using M4 when better tools existed.


I too had a small web site with M4 around 1999/2000. Why M4? Because I'd learned enough of it to be useful/dangerous when wrestling with Sendmail, and it seemed to do the trick (at least when the trick was simply "be easier than manually editing lots of HTML files every time there's a site-wide change").

I suspect I was never doing anything complicated enough to encounter the gotchas mentioned by other commenters...


haha, utterson also uses m4 for templating: https://github.com/stef/utterson/tree/master


I like it that (almost) every dev blog I come across on HN has an RSS feed.

For every interesting article that I read here I follow the feed. Whether you have a Wordpress site, a Bear Blog, a Micro blog, a blog on Havenweb, or a feed on your self-built site, I add them to the 'Really Social Sites' module of Hey Homepage.

Ultimately, I would like to publish this list of blogs, just like Kagi now does with their Small Web initiative. But I guess curating is key to adding quality. And when I think about curating, starting some kind of online magazine seems only natural.


I'm trying to understand (as a dev) if there is something "wrong with me" for not wanting to have my own blog. Where do people get the "entitlement" (I mean that in the best way possible) to share with other people/assume other people care what they are working on? It feels like a competition sometimes. "I need to work on something as cool as possible so I'll get some likes/impressions on my blog".

Collaboration is obviously cool and only works by making it all public; I just don't know where "I'm doing this because I think it's cool" ends and "I'm going to put effort in to share it with others to get reactions" begins.


I have a blog, but I mostly assume people _don't_ care what I'm doing or thinking. Some of my posts have probably never been read by anybody. I still personally find it worthwhile for a few reasons:

- The mere possibility that someone will see it pushes me to put more thought and effort into what I write. Sometimes this reveals weaknesses in my ideas that I would have glossed over if I were just writing private notes for myself; sometimes it leads me to actually change my opinions. It also means the blog posts are easier for me to understand / get value out of than notes are if I come back and reread them years later.

- It creates opportunities for people to connect with me which can pay off at unexpected times. Occasionally people have reached out to me to say a post helped them or resonated with them, or to give a thoughtful reply or ask a question. Those sorts of interactions are really satisfying even if they're rare. (One time, I was interviewing for a dev job and the interviewer asked a question about a post I'd written on the philosophy of John Rawls, and how it could connect to software engineering. I found that absolutely delightful.)

- It's just nice to have an outlet when I feel like writing about something.


I don't have a blog myself but am this close to creating one.

Some guy said that it's a progression.

You start using the web by being a casual reader. At some point you get more comfortable in public spaces and start replying small comments like you would reply to someone afk.

Then you start reading more and more about specific subjects, amassing knowledge, and your replies have more content. They start being organized. They have a structure, to guide future readers and show them how you came up with your conclusion. They have links to sources. They leave open doors for the parts you don't know.

Then you start writing more and more comments, with more and more content, as a result of your experience.

Then comes a moment where you realize you're going to write the same thing for the nth time, and being a good engineer with a focus on DRY, you want to write your thoughts once and for all and link to it every time. This is the moment you start writing a blog that you actually maintain: you write not because you feel the need to write more, but because you want to write less and direct people to it rather than repeating yourself.


I don't think there's something wrong with you. I also think there's nothing wrong with people sharing _interesting_ stuff, whether they do it ultimately for shallow likes or for ... you know... just sharing _interesting_ stuff.

On a side note, I get the "entitlement" from nobody. I take it. I also mean that in the best way possible. Nobody's asking for my software, my (future) articles, my point of view, etc. Still, I make stuff and sometimes share stuff. I think it can be a net value for some people (definitely not for everyone). This is only the reasoning behind it, the main motivator was me realizing I matter as a human being and I have only one life to live. I learned that because of experiencing a 'dark night of the soul' a couple of years back. Luckily I got through. And to be honest, if it wasn't for the internet - made up of personal websites and real people sharing their own experience on forums - that taught me everything there is to know about Cluster B disordered personalities (just an example, cough nothing personal cough), I don't think I would be sitting here typing this lengthy response.

I realized I can not sit back, enjoy the decline of the internet, and only complain about it. I would love to see the web have a lot of personal websites and blogs about every kind of subject, so I started to build a website software. The web/internet, and all the information shared and made easily accessible, made me able to save myself. I was probably helped more by some random dude who put up a website fifteen years ago with everything he knew about certain stuff than I was helped by anything else.


Odd take. If you spend several hours figuring something out, it’s quite neighborly to write it up for the next person. “Shoulders of giants” and all that.

I’m certainly grateful for their help, and have even written up a few of my own.


I think the bloggers are a classic vocal minority, nothing to feel weird about.


> a classic vocal minority

Not saying you're right or wrong, but I myself don't want to look at it like it's a competition of the loudest people.

I've read so many blogs through HN over the last years, and every one of 'em had something interesting to say while also portraying something personal from the author. Whether that's a nice layout, nice color scheme, or even some nice jokes in their bio text.

To me, it can not get any more human than this. Pure individuals connecting on a world wide web. By links, by email, by RSS feeds. All without big tech.


I agree with everything you wrote — what I was trying to communicate is that there’s no shame in not feeling the urge to share as it happens to be that the vast majority of us, like the gp post, don’t but that’s not easy to see or quantify.

Aside, I almost wrote “silent majority” but that seemed like it was veering towards politics so I went with vocal minority; I suspect there is a better term out there but I didn’t find it quickly.


I admit I interpreted more in your short post than was there. People definitely should not feel shame for not feeling an urge to share!

I still encourage people to share though, because I think a lot of people would like to read personal stuff about topics that interest them. Doesn't even have to be with your name and all next to it, anonymous/pseudonymous homepages are usually possible.

Therefore I offer free websites (on a subdomain though) for people that would like to write or post photos about their hobbies. And know that there are way more possibilities to go online; just look at the OP of this thread with a nice SSG.


> I offer free websites

And I am so glad you do!

Writing my longer reply I realized that early social media is a strong counterpoint — people absolutely loved to share when the barrier to doing so was low, the platforms hadn’t been given over to commercialization, and it was less obvious that those details were going to be ingested into an advertising profile. It sounds like you offer a bit of that without the motive or intent that turned mainstream social media into what it is and I think that’s great!


Thanks!

Yeah, somewhere between the homepages and webrings of the nineties and the added social functionality of the early social media platforms. Ideally without the platforms and their incentives. The web itself is already a social platform, a social medium. No need for more layers, especially if they ultimately are against my interests. I think RSS still holds the potential to connect individual websites/people, albeit in a slightly (or maybe even fundamental?) different way than the social media platforms do.

Question: what would be your number one topic/subject to blog about, other than anything tech related?


I have no idea! I tend to be more interested in replying than writing top level comments these days.


"WWW - let's share what we know"


Und die Gedanken sind frei! (And thoughts are free!)


There’s also https://prose.sh which is similar to bear blog.


Sounds familiar, I might have seen it here on HN. I like their 'Discover' page with an overview of interesting posts from others!


A friend of mine described using make to generate scientific papers. He explained that if he changed a single test file, the entire paper could be regenerated with a single command, including running the tests and regenerating the graphs for the changed test.
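
A hedged sketch of what such a Makefile might look like (file names and tools are invented for illustration, not the friend's actual setup):

    paper.pdf: paper.tex results.csv figure.png
        pdflatex paper.tex

    figure.png: plot.gp results.csv
        gnuplot plot.gp

    results.csv: run_tests.sh data/input.txt
        ./run_tests.sh data/input.txt > results.csv
Touch data/input.txt and a single `make paper.pdf` re-runs only the stages downstream of the change.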


It's a neat idea, though I have to point out that if you're already pushing to Github, you could just push the source and Github will publish your markdown as a hosted page: https://pages.github.com/


But that makes you dependent on GitHub for more than just dumb hosting - better make sure you can run the site generation locally from the start.


Fair point, the makefile is nice and portable.


I love the code [1]. Mine [2] is a bit over engineered because I wanted hot-reloading (without JS), and it was a delightful yak shave.

But the basic idea is the same --- heredocs for templating, using a plaintext -> html compiler (pandoc in my case), an intermediate CSV for index generation. Also some handy sed-fu [3] to lift out front matter. Classic :)

Very nice!

[1] https://github.com/karlb/karl.berlin/blob/master/blog.sh

[2] https://github.com/adityaathalye/shite

[3] I'm doing this: https://github.com/adityaathalye/shite/blob/master/bin/templ...

  case ${file_type} in
        org )
            # Multiline processing of org-style header/preamble syntax, boxed
            # between begin/end markers we have defined. We use org-mode's own
            # comment line syntax to write the begin/end markers.
            # cf. https://orgmode.org/guide/Comment-Lines.html
            sed -n -E \
                -e '/^\#\s+shite_meta/I,/^\#\s+shite_meta/I{/\#\s+shite_meta.*/Id; s/^\#\+(\w+)\:\s+(.*)/\L\1\E,\2/Ip}'
            ;;
        md )
            # Multiline processing of Jekyll-style YAML front matter, boxed
            # between `---` separators.
            sed -n -E \
                -e '/^\-{3,}/,/^\-{3,}/{/^\-{3,}.*/d; s/^(\w+)\:\s+(.*)/\L\1\E,\2/Ip}'
            ;;
        html )
            # Use HTML meta tags and parse them, according to this convention:
            #    <meta name="KEY" content="VALUE">
            # cf. https://developer.mozilla.org/en-US/docs/Learn/HTML/Introduction_to_HTML/The_head_metadata_in_HTML
            sed -n -E \
                -e 's;^\s?<meta\s+name="?(\w+)"?\s+content="(.*)">;\L\1\E,\2;Ip'
            ;;
      esac


I found his GEMINI approach quite funny - it strips out most of the formatting with a regexp.

There is a bit of a limitation, though - I organize posts by namespace and with the date in the URL, and make can’t really handle that directly.


> I found his GEMINI approach quite funny - it strips out most of the formatting with a regexp.

Do you mean the regexp in https://github.com/karlb/karl.berlin/blob/master/blog.sh#L4 ? It doesn't remove the formatting, just HTML comments (because they would show up on the page, otherwise) and rel="me" attributes (because they don't work with md2gemini). Feel free to read the blog post about adding Gemini support for more details: https://www.karl.berlin/gemini-blog.html


Huh, I previously skim-read the code and didn't notice the GEMINI regex detail. I wonder why they're doing that.

Re: namespace organisation. I thought about that a lot, and decided to adopt namespace-only convention for symmetry between text file layout, html file layout, and url scheme.

I've treated Date/time as metadata, which I can use to organise index pages. If I get to years worth of posts, then I'll group them by year/month or something reasonable. Likewise tags. I debated tags _and_ categories. But I decided on "everything is a post with tags, and categories will emerge based on topical coverage + post format".


Seeing these sorts of scripts is exactly why we don't write our own, and use something like esbuild and vite.


Well, I have my reasons and you have yours!

For example,

A) Most importantly, I wanted to tinker and have fun!

B) I already use Bash at work and stuff, so it's easy for me.

C) I am generally averse to fast-changing dependencies, and giant dependency trees, so that rules out most scripting languages.

Besides, if you peruse the README, you will see that my code guarantee is "works on my machine". Your mileage will vary tremendously :)


If it's just for fun then write your bundler in assembly for all I care!


You never know :)


The benefit of make is that large programs that are built by slow compilers can be incrementally rebuilt much faster in the face of small changes. Something that would take 40 minutes to do a full rebuild can build in three seconds or whatever.

If your static site can be generated from scratch in under a second by just catting a few hundred HTML files with a common header, there is no benefit to using make over a script. You only risk performing an incomplete build due to a bug in the dependencies.


If the file dependencies don't actually matter, you can mark the build targets as .PHONY

And still get to have things like make build vs make push, etc.
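
A minimal sketch of that kind of Makefile (script and paths invented):

    .PHONY: build push
    build:
        ./generate.sh            # writes the site into public/
    push: build
        rsync -a public/ example.org:/var/www/site/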


If dependencies don't matter, make isn't the right tool.

Your scripted actions can be ./build and ./push.

If you feel you need to type the name of a tool before your command, you can do that: sh build, sh push.


Filling your project namespace with a half dozen 2-3 line scripts is not a big win.

One script with a switch case might be better.


Wow, this is almost exactly what I was planning to do for my site. For another small project, I wrote a tiny shell script as a makeshift "bundler" (just embeds the CSS and JS inside the HTML) with the goal of also being able to serve the unbuilt files locally:

    sed \
    -e '/[/][/]script/r index.js' -e '/[/][/]script/d' \
    -e '/    <script defer src="[.][/]index[.]js"><[/]script>/d' \
    -e '/[/][*]style[*][/]/r styles.css' -e '/[/][*]style[*][/]/d' \
    -e '/    <link rel="stylesheet" href="styles[.]css">/d' \
    index.html
The HTML contains something like this:

    <link rel="stylesheet" href="styles.css">
    <style>
    /*style*/
    </style>
and the script just deletes the <link> tag and replaces the /*style*/ comment with the contents of styles.css. Definitely not my finest work but it worked well enough.


Interesting. So I'm a weird sort, I imagine, in that I'm the type that has been using Linux and shell scripts for 20+ years, but never actually done any big-time coding, and thus I really don't know "make."

Point being, I do something very similar to this; except I first simply write/create my website in Zim-wiki, but then I have a bunch of little tasks to "clean up," i.e. fix/modify some links and then use the Canvas API to update my main course page (which, because I hate Canvas that much, simply links out to my own site).

Why make instead of shell scripts?


Makefiles honestly are just glorified shell scripts. Some of the syntax is a little odd, but you trade that for a more standardized format and the ability to add different build options without mucking around with argparse yourself.


> Makefiles honestly are just glorified shell scripts.

Not really. The concept of a script is "do these things in this order".

The concept of a makefile is "here are a bunch of things that might or might not need to be done, you figure it out".


As someone who also writes a lot of shell scripts and has for decades, I’d guess that if you learn just a little make, you’ll find lots of non-coding uses for it and wish you’d learned it earlier. ;) It’s just another great unix tool that is sometimes very handy to augment shell scripts when you need it, not unlike find+grep+sort, cut+join, awk+sed, gnu parallel, etc. I think make is under-appreciated for its uses outside of code compilation. I use it anytime I’m making gnuplots, or doing image processing on the command line, for example. Whenever you have a batch of files to process, and the batch might need to re-run, and it transforms into other files or a single big file, then make may be the right tool.

Make has at its core one thing that would be pretty tedious to do in shell scripts: update the target (output) file only if it’s older than the prerequisite (dependency/input) file(s). This applies transitively to files that depend on other files that might change during the run of make, which is the part that really separates make from a shell script.

The thing you do when the target needs updating is run a little snippet of shell script, so that part you already know.

After learning how a rule works, you can combine it with ‘pattern rules’ to abstract a rule over all files that share a common prefix or common extension. Suddenly you have a loop but without any loop syntax, and can process a thousand files on demand with 2 lines of make - and without modification you can change a single input file, have it process only a single output file, and not waste your time re-running the 999 other files.
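
As a concrete, hedged example of such a pattern rule (paths are invented and pandoc just stands in for whatever converter you use):

    build/%.html: source/%.md header.html footer.html
        pandoc $< | cat header.html - footer.html > $@
`make build/some-post.html` then rebuilds that one page only when its source or the shared header/footer has changed.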

Also pro-tip: make will run in parallel if you use -j, and it will do it without breaking any dependency chains. If you have a process that turns text files into sql files and then turns those sql files into html files (a possibly nonsensical example), make running in parallel will not blindly update html files first; it will run parallel jobs in the correct dependency order. You can use make to build something like gnu parallel, but one that is able to resume a batch job that was interrupted in the middle!


My number one reason to use make is to have a single centralized location for project commands. If I see a Makefile at the root, I can quickly scan it and have an overview of what high level actions I might execute.

Note that I have recently switched to Just (https://github.com/casey/just). While it doesn't have exactly the same feature set as make, it covers all of the ground for what I typically do. It gets the benefit of history and discards a lot of make cruft, making for a more predictable experience.


So..dangit.

The other thing I've been working on is a personal calendar todo list thing; after years of trying to use other peoples things, I realize I know just enough to make something that works with how I think and what I need (in short, a little like Linux Remind, but with the ability to mark things as done in a way, and the flexibility for "due" dates to go "stale" and still show up)

Anyway, the units/items are individual files with content + some tags with dates.

And I've been doing a thing where I source a file with a ton of functions. And..it looks like this is a better version of that. Wild. hmm.


I used a shell script for that, but vaguely thought of changing to a Makefile for a while, and finally did now, thanks to the article reminding me of that; it is more appropriate. Though the shell script still invokes make, and then rsync, since rsync seems less appropriate for a Makefile. But now it synchronizes fewer files.

As a side note, I am quite happy with XSLT templates to produce the pages (instead of attaching a static header, as in the article), as well as to generate indexes and an Atom feed.


It’s fun to make your own SSG tool, and this is a great example of keeping it simple.

It’s also interesting to read so many comments of people doing similar things.

For my own site, I find that I want an SSG tool that is simple, intuitive, and stays out of the way. With these goals in mind, I have been able to slowly improve my tool over and over. It’s been awesome to be able to do more using less.


It also makes you realize what they actually are.


Absolutely! What mental model did you arrive at?

Mine is "an SSG is just a source to HTML compiler and compositor, plus file organiser".

I reviewed a few tools (jekyll, emacs.love/weblorg, hugo), and ended up making mine in big part because I went down the rabbit hole of "well, why is this part here? why is it like this? why can't we do this other thing? wow this bit is cool, how do I make it myself?".


Hmm, is this SSG tool public? like, on GitHub or something?


The syntax takes a little trial and error and usually finding real-world examples, but I like "make".

I had one project that involved downloading a jdk, using it to build a project specific jre, patching some class files with local java sources, bundling it, etc.

Without being a make expert, it took me a couple of hours of reading, finding examples, etc...but now I have the dependency stuff working perfectly. Where it now only re-downloads or re-builds things if something upstream from it changed, handles errors well and points out what broke, etc.

All that to say, for some things, it's worth looking into for its built-in dependency chain stuff. When you need to run arbitrary outside scripts/tools, it sometimes does things that other build tools can't (gradle in my case couldn't easily do this, or at least I couldn't figure out how).


Make is excellent for tasks where you have a file that needs to exist and steps that reliably cause it to exist.

Excellent too for building a tree of things that depend on prior stages; in your example needing java to run a java applet which generates a file.

The syntax takes some getting used to, but in those cases there is little better.

But I do find people using it as a glorified task runner, and it works, but it's quite possibly the least ergonomic tool available - especially for things like making docker images, which I see extremely often.


There's not really any better option though. Once you start setting environment variables, customising tagging, adding file prerequisites, handling building from a different directory (monorepo requiring access to shared resources), you need some sort of wrapper, with options, arg parsing and some rudimentary dependency checking. Make is a super low-barrier-to-entry solution that, with a small workaround (phony targets), gives you all of the above.
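
A hedged sketch of the kind of wrapper being described (image name and file layout are invented):

    IMAGE ?= registry.example.com/site:latest

    .PHONY: push
    push: .build.stamp
        docker push $(IMAGE)

    .build.stamp: Dockerfile $(wildcard src/*)
        docker build -t $(IMAGE) .
        touch $@
The stamp file gives you the rudimentary dependency checking, and overriding IMAGE on the command line covers the environment/tagging customisation.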


I did some similar experiments some time ago. It includes Makefiles, Rakefiles, SASS, Ruby erb, Jade, m4, and a few other tools.

https://github.com/W4RH4WK/static-page-generators

Overall, I quite like Ruby since it comes with rake and erb.


Rake is probably my favourite of them all, but it adds a dependency on ruby. We've settled on using make and accepting the limitations.


Tossing mine in the pot too: make + pandoc: https://ordecon.com/2020/07/pandoc-as-a-site-generator/index...


Just a couple days ago I set up a site with GitHub Pages and used a very similar setup.

I learned about envsubst in the process which let me fill in values here and there. This is the rough way the homepage works.

    public/index.html: index.md template/header.html template/footer.html
        cat template/header.html > public/index.html
        DATE=$(shell date +%Y-%m-%d) envsubst < index.md | npx marked --gfm >> public/index.html
        cat template/footer.html >> public/index.html
GitHub’s newer version of pages that lets you deploy via GitHub Actions rather than being forced into using Jekyll is just so amazing. I have converted a bunch of static sites to using it as hosting.


Just as straightforward as this is another method that has worked for me for years (decades?).

Server Side Includes.

Have separate files for the header, content and footer, and only edit the content files.


I was there 3,000 years ago… I used SSI’s in the late 90s/early 2000’s before replacing all that with PHP which was a huge step up at the time. I am more than familiar.

The convenience here is largely in being able to use GitHub Pages to host the page while being able to do almost anything you want for a build process. It’s really neat.


I did something similar for mine: I do markdown-to-html using pandoc, then replace the language labels using find (so that prism.js works). I've got it all running via a little Python script (I would've done bash but I'm terrible at it) to generate all the files easily, rather than going through one-by-one: https://git.askiiart.net/askiiart/askiiart-net/src/branch/ma...

I might move to something make-based like this, looks interesting.


Update: I made it into a bash script, and now it only runs on changed or new files. Far more efficient, both because it's just bash, and because it only runs on what's needed.

https://github.com/askiiart/askiiart.github.io/blob/main/md2... or https://git.askiiart.net/askiiart/askiiart-net/src/branch/ma...


just use (gnu) parallel


Most static site generators generate a blog from markdown, which is not feasible for projects like company websites etc. For such projects I like Middleman (https://middlemanapp.com), which provides layouts/partials and things like haml templates.


Nanoc is great in cases like this too. It does less out of the box than Middleman but is easier to extend.

https://nanoc.app/


This amazing course by Avdi Grimm on make and rake for the same purpose has completely changed my understanding of rake, and I recommend that anyone check it out:

https://graceful.dev/courses/acapa/


Author here. Nice to see people appreciate simplicity. If you have any questions, feel free to ask!


I do this too! But my make-fu isn’t as good. I’ll use what I learned from here to make it better.


I as well moved to this sort of approach a few years ago [1]. Definitely like the simple approach and it just stays out of the way.

[1] https://tiehu.is/blog/blog1


Shameless plug for my shell based static site generator https://mkws.sh. You can replace the bin/mkws script with a Plan9 mk file anytime.


Very cool! How long did it take you to make something like this?


Well, I implemented the main idea in a day or two. That being the pp preprocessor. The rest, I really can't remember, it was mainly grunt work to see what are the minimum things a web site is required to have. I still have some stuff to remove.


> There are no exotic dependencies, nothing to maintain and you can quickly adapt it to your needs.

Yeah kinda, except that most people making static sites aren't the people who know Make.


I regularly see people on HN who have a static site for their personal site or blog. There's a niche for this kind of thing.


Up until a certain point, yes. Then you start wanting back links, navigation, etc., and doing that with make alone doesn’t quite work, especially if you have a deep tree of files - single folder sites don’t typically have a lot of content in them.

(My site is generated by a Python commit webhook that indexes new files, diffs an Azure storage container and uploads updated files only).


it's a crazy concept, but people are willing to learn to use a new tool for their hobby project.


Isn't this "cat" as a static site generator rather than "make"? Make is just the build system invoking the static site generator.


Bash/Zsh as a static site generator would be fewer steps.

Or a GitHub Actions workflow as static site generator would only include one shell wrapper, instead of 2.


I've been doing this for decades. It works very well, it can handle complex cases, and it ports trivially between different hosting systems.


What does make bring to the table here compared to a shellscript which loops over the files in the source dir?


Make provides incremental execution of the build graph. It is aware of the dependency graph, and the state of inputs and outputs. It will (ideally) only execute the minimal build rules necessary.

A shell script that checks file mtimes to determine what has already been built, and therefore what can be skipped, is close in spirit.
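
A hedged sketch of that mtime-checking loop in plain shell (file layout invented; pandoc stands in for whatever converter is used):

    for src in source/*.md; do
        out="build/$(basename "${src%.md}.html")"
        # rebuild only when the output is missing or older than its input
        if [ ! -e "$out" ] || [ "$src" -nt "$out" ]; then
            pandoc "$src" | cat header.html - footer.html > "$out"
        fi
    done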

Make variants like GNU Make have additional functionality, like automatic concurrent execution of parts of the build graph.


I use make and pandoc as my static site generator! Generates a good website from my markdown notes.


utterson https://github.com/stef/utterson/tree/master has been generating blogs using make for 14 years now...


Funny, I am setting up a blog with Pelican, which uses "make" for executive control.


I'm gonna use bazel


Is there a GitHub workflow available for this or similar tool?


As long as we're sharing our own projects...

One of the things I did during the pandemic lockdown was work on the simplest possible blog in a single html file. Something that requires essentially no technical knowledge beyond typing text into a file. I recently dusted it off and yesterday I posted the most recent iteration.

Demo: https://bachmeil.github.io/minblog/blog.html

Source: https://github.com/bachmeil/minblog/blob/main/blog.html

There's very little styling, but that's not the objective (and it'd be trivial to add).


hey I wrote almost this exact blog post 15(ish?) years ago except mine used m4 as an "exotic dependency" ;)


"No exotic dependencies and nothing to maintain"

Oh really?:

    sed -E 's|(href="$(subst source,,$<))|class="current" \1|' header.html | cat - $< > $@
a sed script that modifies HTML fragments isn't nothing. And this just does one thing that you might want to customize in the header for each page. It doesn't do things like handle pages having different <title> tags. Every feature like this that you come up with becomes another thing to maintain, and another thing that can catch you out later. When you come back to fix this later, will you even remember what it's supposed to do?

From a web publishing point of view, make and sed are exotic dependencies. There's not going to be a bunch of helpful pointers online to help you debug issues with using them for this purpose. When you google how to fix your sed regex to match specific HTML attributes and not match others, you're going to find stack overflow posts about Zalgo, not quick answers.


Or maybe don't make a site generator and just make a website using HTML files? You'll spend a lot less time painting the shed and a lot more time actually putting your ideas/content on the website.

If you want templating then use server side includes. It's much less attack surface than, say, PHP or some complex CMS, but you can still just make a footer.html and stick it at the bottom of every other .html page easily and avoid the one problem with file-based sites: updating shared bits.


If only there were a way to make consistent structure in an online document (such as hypertext) and separate the styling into distinct files. Even better, what if we could make separate styling for mobile, desktop, and printing, all with the same content?

If only it were possible using existing standard web technology. Sadly it was never designed with such goals in mind.

/s



