Spack – scientific software package manager for supercomputers, Linux, and macOS (spack.io)
157 points by telotortium on March 20, 2023 | 100 comments



Original author here -- since I know that people click the comments to decide whether to read further, here's a (biased?) summary:

Spack is aimed at end users, developers, and admins. It's used primarily in HPC at the moment but it would be great to see use outside of the HPC community.

Spack supports:

1. Basic installation (spack install foo, foo@3.1.2, foo@3.1.2 +option, etc.)

2. Reproducible environments, via `spack.yaml` and `spack.lock` files that you can version in a repo. You can use this to build a custom software stack, build some dependencies for a project, etc.

3. Building many versions of the same package, e.g. these are perfectly fine and can coexist:

    - spack install hdf5
    - spack install hdf5 %clang
    - spack install hdf5 %oneapi@2023.2.0
    - spack install hdf5@1.18.0 cxxflags="-O3 -fast" target=cascadelake
    - spack install hdf5 +mpi ^mpich
    - spack install hdf5 +mpi ^openmpi
4. Both source builds and installation from relocatable binary caches

5. Lots of conditional features in packages that other systems do not offer, e.g.:

  - injecting compiler flags
  - building for specific uarch targets
  - optional dependencies
  - very fine-grained options (e.g. +/- cuda, set cuda_arch, etc.)
  - building the same package with different compilers
  - building the same package with different potentially ABI-incompatible dependencies like MPI or different boost versions.
6. Autogenerating things like container recipes, CI pipelines, Lmod/environment modules, etc.

7. Good support for building against external packages that may already be on your system.
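As an example of (2), a minimal `spack.yaml` might look like this (the specs and the `unify` setting are illustrative; see the Spack environments documentation for the full schema):

```yaml
# spack.yaml -- a versionable description of an environment.
# Running `spack install` in this environment also writes a
# spack.lock file pinning the exact concretized configuration.
spack:
  specs:
  - hdf5@1.12.2 +mpi
  - zlib
  concretizer:
    unify: true
```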

At a very high level, Spack has:

* Nix's installation model and configuration hashing

* Homebrew-like packages, but in a more expressive Python DSL, and with more versions/options

* A very powerful dependency resolver that doesn't just pick from a set of available configurations -- it solves for a valid configuration of your whole build graph.

You could think of it like Nix with dependency resolution, but with a nice Python DSL. There is more on the "concretizer" (resolver) and how we've used ASP for it here:

* "Using Answer Set Programming for HPC Dependency Solving", https://arxiv.org/abs/2210.08404


> Good support for building against external packages that may already be on your system.

Able to use 'internal' packages (spack install openssl) or external packages (link against /usr/lib/libssl.so) on a per-item basis: sometimes you want the newest code, but other times you want the OS package so that you can get security updates 'for free'.

Most other systems do one or the other: Spack allows you to choose.
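Concretely, that per-package choice is expressed in `packages.yaml`; this fragment (paths and versions are illustrative) tells Spack to use the OS openssl and never build its own:

```yaml
# packages.yaml -- register the system openssl as an "external";
# buildable: false forbids Spack from ever building it from source.
packages:
  openssl:
    externals:
    - spec: openssl@1.1.1k
      prefix: /usr
    buildable: false
```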


What do you think about Spack vs. Conda/Mamba? I've been a contented (maybe not "happy") Conda user for years. Is there any reason to switch? I'd be concerned about package availability.


We have a guide for this written by Adam Stewart, probably our most prolific (package) contributor:

* https://spack.readthedocs.io/en/latest/replace_conda_homebre...

The thing you'll likely notice most about Spack vs. conda/mamba/rattler(?) is that it lets you pick your versions and options much more fluidly, and it'll build you a package from source if there's no binary for it. Spack is much more seamless for integrating binary installations with source builds. `conda-build` is a whole other tool outside the normal user workflow.

Spack has over 7,000 packages at this point, so I don't think the lack of packages is going to hurt too much. We don't offer nearly as many public binaries yet, though, so the source builds may make the UX more painful, depending on what you're trying to do.


Spack has Anaconda as a package:

* https://spack.readthedocs.io/en/latest/package_list.html#ana...

* https://spack.readthedocs.io/en/latest/replace_conda_homebre...

So you can continue using it, but you also have a super-set of features for other things as well.


Very neat:) Given the similarity in features, could you comment on how it compares to nix?


Main differences with Nix:

* Spack has a dependency solver (concretizer). nixpkgs is managed by humans who do the dependency solving for you over time. What this means practically is that you can install 5 versions of some package with different dependencies and flag combinations with a few commands in Spack, while with Nix you would probably need to check out different commits of the nixpkgs repo and maybe do some hacking on nix derivations to get the same thing. Spack is fundamentally designed to help with combinatorics.

* Nix builds all the way down to libc; Spack (currently) doesn't -- it's designed to live on an existing system.

* Nix build environments are isolated (but require root to run); Spack's aren't as completely isolated and don't require root (so you can run Spack in your home directory).

* Spack installation hashes are what Nix would call full configuration hashes -- they're configuration metadata hashes, not content hashes of the installation. So you could say we're not quite as committed to exact binary reproducibility, but you could also say that this allows us to support relocatable binaries (which Nix does not). Also, our metadata is pretty detailed.

* Spack's DSL is Python; Nix is, well, Nix.

Spack is very much inspired by Nix, and we talked about this a bit in the original paper here:

* https://tgamblin.github.io/pubs/spack-sc15.pdf


> with Nix you would probably need to check out different commits of the nixpkgs repo and maybe do some hacking on nix derivations to get the same thing. Spack is fundamentally designed to help with combinatorics.

Typically you'd use the overlay mechanism combined with package overrides for this, so you would have to write some Nix code, but you can do it out-of-tree (the most common location of choice would be inline in your configuration repo).

One of the usability issues with this is that the APIs for overriding dependencies and flags in Nix packages are sometimes a bit language ecosystem-specific. The uniformity of Spack's approach strikes me as an even bigger win over Nixpkgs here than the fact that you can append those tweaks directly to CLI invocations without saving any code to a file. :)

> Spack installation hashes are what Nix would call full configuration hashes -- they're configuration metadata hashes, not content hashes of the installation. So you could say we're not quite as committed to exact binary reproducibility, but you could also say that this allows us to support relocatable binaries (which Nix does not).

Nix supports content-addressed store paths but it's still not very widely used yet, and it's opt-in. A normal system with the feature enabled will have a mix of configuration-addressed and content-addressed paths.

Spack looks really, really cool. Thank you for advancing the state of the art. :D

I hope Spack's package collection is seeing and will continue to see exponential growth like Nixpkgs and Guix are! It seems like the stronger design fundamentals in these newer package management ecosystems can really enable that.


I'm not sure if I understand the first point. Doesn't Nix currently achieve the same thing feature-wise?

For example,

    spack install packageX %libA@2.0
would translate to

    nix-build -E 'with import <nixpkgs> {}; packageX.override { libA = libAv2_0; }'
For this to work, both Spack and Nix would have to package libA v2.0. If it isn't packaged, the user would need to create their own definition. Assuming a different version of libA is already packaged in Nix, here's how it would look without the "nix-build -E" part:

    let
      pkgs = import <nixpkgs> {};
      libAv2_0 = pkgs.libA.overrideAttrs (old: {
        src = pkgs.fetchzip {
          url = "...";
          sha256 = "...";
        };
      });
    in
    pkgs.packageX.override { libA = libAv2_0; }
I assume something similar would be required for Spack too.

So aside from Spack having a nicer shorthand syntax for customization, I don't get what Spack can do that Nix can't in terms of features. Or to be more specific, how a dependency resolver can eliminate human work.


Spack packages are parameterized, so there is one `package.py` file per package, not several, and you can have many versions and options declared. See, e.g., `zstd`: https://github.com/spack/spack/blob/develop/var/spack/repos/...

This was a conscious decision we made to allow many versions to be built without requiring split implementations and without checking out different commits -- that saves some work already. It also forces the repo maintainers to consider the other use cases, which tends (IMO) to make the recipes more portable.

But the real question is not just work saving but correctness. You cannot always just swap in libA@2.0 as in your example. For:

    packageX ^libA@2.0 
Suppose that both packageX and libA depend on MPI (a versioned standard for which there are several implementations) and, further, libA requires MPI@3. Spack will ensure that:

    1. packageX and libA use the same MPI implementation (e.g., openmpi, mpich, mvapich)
    2. the MPI implementation chosen satisfies the provider requirement
    3. any version/option constraints that packageX has on MPI are satisfied along with those of libA.
You could also imagine that the two packages might have conflicts with certain implementations of MPI, e.g. say packageX conflicts with mpich and libA conflicts with mvapich. You'd have to choose openmpi.

This isn't just an overlay; it's a constraint solve. The choice of libA@2.0 can have effects on other nodes in the graph, and different choices may need to be made elsewhere; maybe even things like disabling options or choosing different dependencies.
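A toy illustration of why this is a solve and not a substitution (plain Python, not Spack's actual ASP-based solver; the package names and conflicts are from the example above):

```python
# Toy model of the MPI-provider choice described above: packageX
# conflicts with mpich, libA conflicts with mvapich, and both must
# share one MPI implementation. Spack's real solver uses ASP (clingo);
# this brute-force version just shows why the choice is global.
providers = ["openmpi", "mpich", "mvapich"]
conflicts = {
    "packageX": {"mpich"},
    "libA": {"mvapich"},
}

def valid_providers(packages):
    """Return the providers acceptable to every package at once."""
    return [p for p in providers
            if all(p not in conflicts.get(pkg, set()) for pkg in packages)]

print(valid_providers(["packageX"]))          # ['openmpi', 'mvapich']
print(valid_providers(["packageX", "libA"]))  # ['openmpi']
```

Adding libA@2.0 to the graph shrinks the set of valid choices for a node (MPI) that libA doesn't even directly name in the request, which is exactly the kind of ripple effect an overlay can't compute for you.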


> nixpkgs is managed by humans who do the dependency solving for you over time

One nice feature of Nix is the "import from derivation" pattern: if some Nix definition happens to call the built-in functions 'import' or 'readFile' with a path to some other Nix derivation (AKA build product), Nix will first build that derivation, then import the resulting file to finish evaluating the original definition.

This way, we can have one derivation which runs some arbitrary dependency-solving command; and import its result as part of our package's definition. We do this where I work: we run the `mvn2nix` command inside one derivation, to get a JSON file describing our Maven project dependencies (jar and pom files); this is imported, and used to define a self-contained folder of those jar and pom files; then the main project definition runs Maven pointing at that folder.

The downside of this approach is that it slows down evaluation: Nix usually has an "evaluation phase", where we figure out what to build; then a "build phase", which runs those builders. When we "import from derivation", this distinction is lost: in order to figure out what to build (e.g. which Maven command to run, to build our project), we must first do some building (e.g. to figure out what should go in the folder that Maven will look in for dependencies). For this reason, the Nixpkgs repo doesn't allow definitions which use "import from derivation"; however, it's a very handy tool for our own personal or organisational projects :)

(PS: Many years ago I tried to do something similar for building Haskell projects; but never really got it to work :( )


+100

From what I saw, the package definitions are in Python. Given that Python is really an imperative language, I don't see how Spack could be comparably powerful for definitions and reuse.


The Spack DSL for defining versions, options, etc. is declarative. That part of a package exposes options to the solver and very much allows reuse, in the sense that the same package can be built many different ways, and even with different dependencies. The whole system is parameterized in ways that Nix is not.

The imperative part of a Spack package is the build recipe, which AFAICT is not so different from bundling a bash script in a Nix derivation. Just like Nix, it's included in our configuration hash, so if you change the recipe you get a different installation of the software.


Thanks for the answer.

> That part of a package exposes options ... in the sense that the same package can be built many different ways, and even with different dependencies.

It is quite a common pattern in nix to have package build flags and the dependencies as inputs in a derivation. This definitely makes it possible to build in different ways and with different dependencies.


Yep! I'm not a nix expert but my understanding is that this is typically done with overlays. So you can, say, swap out openmpi for mpich, but if mpich requires a different version of hwloc that conflicts with the one defined in some other branch of your derivation, you're out of luck. In Spack that is handled through dependency resolution. There have been discussions of how to deal with this in Nix, e.g.: https://discourse.nixos.org/t/concept-use-any-package-versio....

Similarly, for packages like HDF5 that tend to be depended on with options that affect its API (+parallel, ~parallel, etc.), I do not think there are automatic ways to get nix to ensure that your request for hdf5+parallel is consistent with other packages in the graph.

Possibly worth a mention: for uarch flags like the ones shown here: https://nixos.wiki/wiki/Build_flags, the user is responsible for setting them specifically for the requested compiler. We've abstracted that with `archspec` (https://github.com/archspec/archspec) so that you can just ask for a target and the correct compiler flags are injected. e.g., `target=cascadelake` or `target=zen3`. So if you switch compiler, you do not have to look up these details. See https://tgamblin.github.io/pubs/archspec-canopie-hpc-2020.pd... for more on how it works.

The salient parts of archspec aren't in a python library but in a JSON file (https://github.com/archspec/archspec-json/blob/master/cpu/mi...) so it's something that could probably be incorporated into Nix.


The package definitions are quite declarative even though the language is imperative; see an example here [1]. For hashes, Spack uses the normalized abstract syntax tree (comments, whitespace, etc. removed).

With Spack it's easier to contribute: you don't have to learn a new language, as you do for Nix or Guix.

[1] https://github.com/spack/spack/blob/develop/var/spack/repos/...


After using it for ~8 years, I rather like Nixlang. At the same time, I don't know Python well or particularly like it, so Spack's DSL just doesn't look as comfy to me. It reminds me of RPM spec macros a little bit, which I guess are fine but I don't love them. I'd rather see Nix learn from Spack than myself jump ship.

But I think it would be very silly to dismiss Spack over something like the fact that their DSL is Python-based.

It's fine to be partial to what you know, and there are lots of good reasons to prefer Nix today, as well as to think that improving Nix might be better for you than adopting Spack.

But I think we should go a bit deeper than this in evaluating competing projects. Nix is about more than just being in some FP club, you know?


I could be persuaded to trade a certain amount of power for ease of use


Compared to homebrew, do prebuilt binaries have to be installed in a predefined path they were built in or is it also flexible?

Also, is there an indication/warning whether the package is prebuilt before you install?


> do prebuilt binaries have to be installed in a predefined path they were built in or is it also flexible?

Prebuilt binaries don't have to go in a fixed prefix -- we'll relocate RPATHs, shebangs, strings in text files, and (as long as you build in a longer path than you install to -- see `padded_length` at https://spack.readthedocs.io/en/latest/config_yaml.html) strings in binaries.
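For reference, the padding knob mentioned above lives in `config.yaml` (the root path here is just an example):

```yaml
# config.yaml -- pad install prefixes with placeholder characters so
# binaries built under this (long) prefix can later be relocated
# into any shorter prefix.
config:
  install_tree:
    root: /opt/spack
    padded_length: 128
```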

There is not currently a way to see before you install if an install will be from binary or from source, but that is something we are working on.


So how would I know whether the prebuilt binaries were built in a longer path than I install to?

Is there at least a way to block source builds if the app is not in the binary cache to avoid any surprises (basically have manual override/confirmation instead of automatic fallbacks to source builds)?


> So how would I know whether the prebuilt binaries were built in a longer path than I install to?

All the public ones are built with very long padding; if you make your own you're currently on your own.

> Is there at least a way to block source builds if the app is not in the binary cache to avoid any surprises (basically have manual override/confirmation instead of automatic fallbacks to source builds)?

Yep:

    spack install --cache-only ...


lmod integration?


Yes:

* https://spack.readthedocs.io/en/latest/module_file_support.h...

Spack supports both Lmod and (Tcl) environment modules, and can generate either or both.


An unfortunate name, a la Nonce Finance and Git.

https://en.m.wiktionary.org/wiki/spack


For Americans who don't want to click the link: "Spack" is basically a milder British-English version of "retard". Imagine how well the "Spaz programming language" would do in your market.


I think it would probably do fine, given we have languages like Brainfuck, image editors named The GIMP, parallel-processing tools named Linda (after the star of a pornographic film), etc.

I just don't think people are that sensitive about software naming.

There used to be a relatively popular package called "System Administrator's Tool for Analyzing Networks." A friend of mine still has a physical copy of O'Reilly's "Protecting Networks With SATAN."


Those names don't serve to degrade people. For me, as a British person, this software is basically called Retard.

I'm not sensitive enough to care but some people will find this offensive or at least make them pause before suggesting its use. Some word filters will filter this out too.


It's certainly not milder.


FWIW “retard” is not very mild at all in US-English nowadays, it has been essentially removed from even mildly polite conversation (which is to say, if I’m hanging out with friends, and we’re all happily swearing away as one might with friends, we’d still avoid it, it is firmly in the “slur” category).

This is a moderately recent occurrence in the US; around here I want to say very strong pushback started to occur around… I dunno, 15 or so years ago?

Anyway, I’m not sure how that compares. But “less serious than retard” sets the ceiling pretty high, for the US audience.


This is definitely an international blunder. I’d say it’s more socially acceptable in the UK to call someone a c*t than it is to use this word. Can safely say it’ll get zero UK adoption due to the name. Shame.


Given that it's in use at Cambridge and several other HPC centers around the UK, I'm going to disagree. I think most people are cosmopolitan enough to set aside the UK's niche insult community to move forward with real work. I doubt there's any single-syllable morpheme which doesn't translate to something offensive somewhere.


the UK's niche insult community

It's a horrible term that was used for years to denigrate disabled people and that it is used now in cloistered and privileged academic settings is no excuse for an ignorant dismissal like that of the hurt, bigotry and exclusion it conveys.


I'm UK native and had never heard of the word as an insult. I wonder if it's a regional thing?


Perhaps a generational (and maybe class?) thing as well; it and related terms were vicious and influential enough to have motivated The Spastics Society to change their name to Scope in the mid-nineties [1]. I guess growing awareness of disability discrimination and bigotry also contributed to the term being recognized as beyond the pale and to its disappearance from casual conversation, with the corollary effect that some people outside those demographics are ignorant of how awful those terms were.

[1] https://www.bbc.com/news/blogs-ouch-26788607


Well the UK has done an amazing job at getting rid of almost all uses of the word "spastic" due to the campaign against it, so it could be for that reason


I'm old enough to remember that and some other derivations being very common in the playground. Just not this one.


To be fair I didn't remember it until I put the "a" on the end, like "spacka". Then I remembered


"insult community"


And then there are those of us who deal with nonce daily. I wince every time I see the word, and it's really awkward if non-technical folks hear us talking about them.


Yeah, spack(er) is really up there for an insult. You really have to be extremely vitriolic to call someone a spack.


I think the lesson is that British slang doesn't count when it comes to potentially offensive open-source software names.


Have to say I was fairly surprised by this name.

It's a fairly old-fashioned expression, but could well cause some misunderstandings.


I wouldn't say that, it sees regular use even today.


To be fair, I should ask my kids.


Interesting, the same word with the same insulting meaning is used in northern Germany… TIL it's in English too. Maybe a leftover of the British occupation after WWII: Spacken, Spacko, Spack.


Then there's "MongoDB", which is also slightly offensive in German, because Mongo refers to Mongoloid, a derogatory term for people with Down syndrome.


Mongo has the same meaning in English, but as a Millennial, I've literally never heard this used in real life, only seen it referred to as an old insult. So it's quite unlikely that the founders and early users of MongoDB knew it was an insult.


Agree. Terrible choice of name, very offensive in the UK; a small amount of googling should have flagged it.


Git was very intentional. Nonce and this I expect were not.


Suggestion for international workaround: pronounce as “ess-pack”?


I think it fits. It’s like spackle, that you use on walls.


Thanks for the edification. Here I was thinking it sounded too Vulcan.


I’ve tried to like this more than once, but once you hit some cryptic error somewhere in the build you’re not going to have a great time.

Also unfortunately it doesn’t always play nicely with lots of high energy physics software which really leans on using LD_LIBRARY_PATH and running custom executables during builds. I’ve had bad experiences with ROOT and Gaudi with no obvious paths forward for fixing things.


Sorry to hear this. We have a lot of connections with the high energy physics community, which is why ROOT and Gaudi are even in Spack to begin with. There is an #hep channel on https://slack.spack.io, and we talk to those folks fairly frequently to figure out what we can add to support these codes better.

On `LD_LIBRARY_PATH` specifically: we intentionally inject RPATH-ing compiler wrappers into the build to avoid these types of problems, which are on by default for the ROOT and Gaudi builds. Would be curious to hear where the remaining pain points are.


We have used spack to maintain our 1000-person user environments at the electron-ion collider (including ROOT, Gaudi) for the past two years and couldn't have done it otherwise.


Yes, I've found it increasingly frustrating after managing a university central HPC system with rpms. (It's a myth that you can't support different versions with rpm or dpkg if necessary.) I was at least convinced of the need for an actual package manager -- i.e. not Easybuild that everyone else favoured -- after the hell of people's environment modules installations and combinatorial explosions.

For instance, several times I've had Spack in a state where it had obscure apparently internal errors that I didn't have the energy to try to diagnose, which resulted in throwing everything away and starting again. I found build recipes frequently don't actually work, especially if you're not on x86_64, though architecture accounts for a minority of the failures. You also end up quickly running out of space in ~10GB of home directory, at least if things aren't shared on the system.

There is a tension between "reproducible" static configurations and the dynamics you need, e.g. to be able to link instrumentation or acceleration libraries at run time. However, there is a Spack option to use RUNPATH instead of RPATH that I'd have to look up.


Both of those are non-trivial and annoying to say the least, I know ROOT has a few very smart people managing build recipes in conda-forge too.


The name is a bit unfortunate, to say the least: https://en.m.wiktionary.org/wiki/spack


I encourage anyone to follow the "install your first package" instructions, but pick a real scientific package like WRF or GROMACS. Be prepared to wait a while though...


If you do pick WRF, you can also use a binary cache, as shown in this workshop: https://weather.hpcworkshops.com/.


I haven't tried since the binary cache launched - but over the years I've attempted to use Spack following their "getting started" guide half a dozen times and achieved nothing but long waits and disappointment; part of the reason for recommending it was a sanity check on whether I am just impatient or have unreasonable expectations when it comes to software behaving as documented.


There's a good interview with the author in an episode of The Manifest: https://manifest.fm/11

It's been a while since I've listened, but I remember it being pretty interesting.


I'm surprised that no one here has talked about the build times yet. As a new user, the first thing you notice out of the box is that even on a relatively beefy system, you spend ~2 hours getting to a working state, even when you're building a relatively modest set of packages. This is particularly surprising since one of the points of Spack is to integrate with the existing system software, and so much of what you're building already exists and works fine on the system.

In the past when I've talked to people about this, the answer I got was "well, set it up once for your team and don't reinstall it all the time". Which is kind of awful from a "cattle, not pets" perspective. Several other design decisions in Spack (like the lack of certain kinds of configuration isolation by default, and the general difficulty of configuring certain kinds of things) make it feel like a "pets" system. I've gotten instructions from people on using Spack to install things that have been impossible to follow because they fiddled with something in their environment and forgot to tell me.

The best thing I can say about Spack is that all the other systems are worse. I just wish they could get around to fixing some of this stuff already.


Guix has a different take on this problem: it provides package transformations to modify the build graph on the command line (or in a Scheme file), which don't preclude the use of binary substitutes and reproducible builds.

All stuff is cached in /gnu/store so people won't have to rebuild packages, and with `guix publish` people can share built binaries with others (discovery via Avahi optional).


> This is particularly surprising since one of the points of Spack is to integrate with the existing system software, and so much of what you're building already exists and works fine on the system.

Try spack external find, and much of this problem goes away. It will register the existing packages as externals.

As to the rest, spack actually has public binary mirrors with builds of many packages for common distributions now, in addition to the E4S mirror. It may not be as easy to ensure use of these as would be ideal, but this situation has gotten much, much better over the last few years.


> Try spack external find, and much of this problem goes away. It will register the existing packages as externals.

To be clear, this goes through a hard-coded list of packages, and adds them to Spack. It cuts maybe about 20-30 minutes off your 2+ hour build time. And it doesn't find any libraries (MPI being the most obvious one you'd want). So, better than nothing, but far from what it needs to be.

> public binary mirrors

Is that on by default? I went through this only a couple of months ago and I don't think any of my packages came from mirrors.

One practical challenge is that you definitely want some of your packages built with the local compilers, with maximally native settings (so native MPI, march=native, etc.). Other packages you may not care about at all (autoconf, etc.). So while I believe that Spack allows you to do that, it doesn't necessarily make it easy. I suspect when you tell Spack to build something with a certain compiler, it just builds everything from scratch because it can't tell where you actually care about getting the most tuned code.


> And it doesn't find any libraries

There is some support for libraries with spack external find. For example, most of the ROCm libraries can be used that way.


> like the lack of certain kinds of configuration isolation by default

could you please explain a little what kind of "configuration isolation" you mean?

> and the general difficulty of configuring certain kinds of things

like what for example?


> could you please explain a little what kind of "configuration isolation" you mean?

It's been a while so I don't have the whole list off the top of my head, but there are several configuration files that live in $HOME by default unless you set SPACK_DISABLE_LOCAL_CONFIG=1. Those files can e.g., list compilers and software packages, if I recall, among various other settings.

> like what for example?

I was trying to follow someone's recipe for building something with Spack. I hit some sort of compiler error (sorry, it's been too long, don't remember). And the solution they suggested was to modify compilers.yml by hand to set various things. (Something about setting the right paths for using Cray's compiler wrappers with rocmcc...) That and I had to fuss with packages.yml to configure the system MPI. I remember trying to figure out how use "spack config set" to configure these files automatically, but the syntax was (a) completely undocumented and (b) not actually powerful enough in practice to configure either of those files. So I gave up on that, set SPACK_DISABLE_LOCAL_CONFIG and just saved the modified yaml files into my repository so that I can at least recover my settings if I need them in the future.

For advanced use cases, it seems like you basically need to modify these yaml files to get anything done, but the amount of documentation and support for doing this automatically is really lacking. (And the idea that you're supposed to require every user to do this manually is even more insane.)
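For anyone hitting the same wall: a hand-written `compilers.yaml` entry has roughly this shape (paths, versions, and OS here are illustrative, not a recipe; on Cray systems the cc/cxx/fc entries are typically pointed at the cc/CC/ftn wrappers):

```yaml
# compilers.yaml -- manually registered compiler entry.
compilers:
- compiler:
    spec: rocmcc@5.3.0
    paths:
      cc: /opt/rocm-5.3.0/bin/amdclang
      cxx: /opt/rocm-5.3.0/bin/amdclang++
      f77: /opt/rocm-5.3.0/bin/amdflang
      fc: /opt/rocm-5.3.0/bin/amdflang
    operating_system: sles15
    modules: []
```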


This sounds great. I'd love to have something like this within nix.

Nix is great for full system management. Less so for something like python package management. Its lack of a dependency resolver or multiple versions of a package within the same nixpkgs instance makes it difficult to work with in some (lots of) instances. Generally, I use pipenv or poetry from within a nix flake to define the environment.

Spack doesn't seem to cover system management (please correct me if I'm wrong here, but it seems to be out of scope). Combining them: using spack within a flake to define and build a set of dependencies seems like the best of both worlds.

Unfortunately, at the moment, it doesn't appear to be available in nixpkgs.


Been using Spack for a while to manage my machine learning package dependencies. It allows me to quickly spin up projects with complex dependencies (my current environment has 329 packages built ...). It's pretty easy to use with containers. It allows me to evaluate and migrate to different PyTorch/CUDA versions easily.


I’m naive about HPC, but it’s always surprised me a bit that HPC software seems to be a fairly distinct ecosystem. I’m not doubting it’s for good reasons, but why is a special package manager needed for this domain?


HPC isn't exactly special, and to some extent neither is Spack. Spack's gotten some attention in talks at, e.g., CppCon.

The big differentiator is really the degree to which people want to tune and customize their builds in HPC vs. other communities, and the diversity of hardware that needs to be supported. Things that stand out to me:

  - different applications' needs to customize the builds of their dependencies
    - e.g., one app might need HDF5 with MPI support, another might want it 
      built without. Those are two different, incompatible HDF5 builds.
  - need for specific microarchitecture builds to take advantage of vectorization
  - need for GPU support (for NVIDIA, AMD, *and* Intel GPUs)
  - need to use sometimes vendor-specific tuned system libraries like MPI, cray-libsci, mol
  - need for specific *versions* of dependencies (solvers, mesh libs, etc.) for
    numerical reproducibility
  - need to integrate across languages, e.g. C, C++, Fortran, Python, Lua, perl, R,
    and dare I even say Yorick.
Most of these requirements are not so dissimilar from people's dev environments in, say, the AI community, where people really want their special version of PyTorch. While monorepos are common in industry, they haven't really taken off in the distributed, worldwide scientific community, so you get things like Spack that let you keep rebuilding the world -- and all the microcosms in it.

So I'd say Spack is not so much a special package manager as a much more general one. You can use it for combinatorial deployments at HPC centers, dev workflows for people dealing with multi-physics and other complex codes, and as sort of a distributed poly repo with lock files.

The intent was never to be specific to HPC, and I would love to see broader adoption outside this community.


Thanks for the thoughtful response. That makes a lot of sense and these capabilities are not something you'll find in debian packages. It's not easy to ask for different compilation flags there.


First paragraph of the docs - "Spack is a package management tool designed to support multiple versions and configurations of software on a wide variety of platforms and environments. It was designed for large supercomputing centers, where many users and application teams share common installations of software on clusters with exotic architectures, using libraries that do not have a standard ABI. Spack is non-destructive: installing a new version does not break existing installations, so many configurations can coexist on the same system."


The need for HPC specifics is definitely oversold, if not by Spack itself. One thing is the need for proper package management for unprivileged users if they don't use container messes, and the fact that there's so much stuff to share recipes for. It would help if there were more attention to performance engineering, or engineering generally, like having the sort of dynamic micro-architecture selection that BLAS libraries typically do, or attention to library substitution via dynamic linking (e.g. BLAS again).


What does this offer over the usual suspects brew, linuxbrew, apt… ?


Multiple versions of the same package can be handy: e.g., Python 3.6, 3.8, and 3.10 all handled by the package manager, with the ability to switch between them on a per-session/terminal basis (no need to completely disable a particular install: just adjust $PATH and $LIB as needed).

Able to create pyenv-like environments which can then be 'exported' and sent to other systems/users having Spack so that development/debugging can be done with the same setup:

* https://spack-tutorial.readthedocs.io/en/latest/tutorial_env...

Able to use 'internal' packages (spack install openssl) or external packages (link against /usr/lib/libssl.so) on a per-item basis: sometimes you want the newest code, but other times you want the OS package so that you can get security updates 'for free'. Most other systems do one or the other: Spack allows you to choose.
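As a sketch, a spack.yaml environment can pin both choices side by side (the specs, version, and prefix here are just examples):

```yaml
# spack.yaml -- build a Python stack from source, but reuse the OS OpenSSL.
spack:
  specs:
  - python@3.10
  - py-numpy
  packages:
    openssl:
      externals:
      - spec: openssl@3.0.2
        prefix: /usr
      buildable: false   # always link against the system libssl
```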


Spack allows you to have several matrices of the same package installed. So, if you want to install Foobar, and you want one copy built with gcc and linked to openmpi, one copy with llvm and linked to mpich, and three copies with the Intel compiler and three different versions of the Intel MPI libraries, Spack can do that for you, and help manage your environment to ensure you are running the stack you intend.

Its primary use is in supercomputing, to make various optimizations available to users.
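That combinatorial case can be written directly as a spec matrix in a spack.yaml environment. A rough sketch (the 'foobar' package and the compiler/MPI choices are placeholders):

```yaml
# spack.yaml -- build the cross product of package x compiler x MPI.
spack:
  definitions:
  - pkgs: [foobar]
  - compilers: ['%gcc', '%llvm']
  - mpis: ['^openmpi', '^mpich']
  specs:
  - matrix:
    - [$pkgs]
    - [$compilers]
    - [$mpis]
```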


The examples you listed are designed to install packages on a single system.

Spack is designed to install packages in a network-storage path that can be accessed by multiple semi-heterogeneous systems that run the same main architecture.


I was excited about Spack and used it for some small packages, but it didn't have proper binary artifacts then. Ended up settling on conda-forge and it's okay.

If I was doing a large scientific computing project today, I’d probably reach for bazel with some conda rules hooked up to conda-forge to set up third party packages.


Not to be confused with the SWC bundler for JavaScript https://swc.rs/docs/configuration/bundling


How does it compare to other HPC software managers such as EasyBuild?


If I have to pick one difference, the fundamental one is that Spack does dependency resolution with a solver, while EB does not.

So, in Spack, you write a package.py file (like say the one for gromacs: https://github.com/spack/spack/blob/develop/var/spack/repos/...), and you can build that package with any of the versions/options/dependencies/etc. expressed in the package.

With Easybuild, someone (maybe you?) has to write a specific configuration (easyconfig) for every build you want to do. Want to tweak a version? Write a new easyconfig. The dependencies point at specific versions, too, so now write configs for all the dependencies, and so on. There is a lot of copying/tweaking of file trees in the EB workflow.
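To make the contrast concrete, here is a minimal sketch of what a Spack package.py looks like. The package ('mylib') and its options are hypothetical, checksums are elided, and this only runs inside a Spack repo, but it follows the same DSL as real recipes like the gromacs one above:

```python
# Sketch of a Spack recipe; 'mylib' and its option names are hypothetical.
from spack.package import *

class Mylib(CMakePackage):
    """Hypothetical library used to illustrate the package DSL."""

    homepage = "https://example.org/mylib"
    url = "https://example.org/mylib-1.2.0.tar.gz"

    version("1.2.0")  # a real recipe would pin a sha256 here
    variant("mpi", default=True, description="Enable MPI support")

    depends_on("cmake@3.18:", type="build")
    depends_on("mpi", when="+mpi")  # any MPI provider can satisfy this

    def cmake_args(self):
        # Translate the variant into a build option.
        return [self.define_from_variant("ENABLE_MPI", "mpi")]
```

One such file covers every version/compiler/MPI combination; the solver picks a concrete configuration at install time, rather than each combination needing its own config file.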

Aside from that:

  - Spack has a package database and allows you to query what is installed.
    - EB does not have "uninstall", just `rm -rf`
  - Spack builds from source and also handles binary packages (no binaries in EB)
  - Spack supports environments via `spack.yaml` / `spack.lock`; EB doesn't have that.
  - Spack doesn't require Lmod to work -- you can just `spack load` or `spack env activate` to use packages.
  - Spack builds with RPATH by default (much like Nix does) -- dependency libraries are found automatically even if you run an executable straight from its directory.
There's more, but Spack is a package manager, while EB is really set up to automate a certain type of installation common in HPC.


EasyBuild lead developer here ^_^

Not all of this is 100% correct, so let me pitch in:

  - EasyBuild currently doesn't have an uninstall option, that's true, but since every software installation sits in its own separate directory, it basically boils down to removing that directory plus the environment module file that EasyBuild generated for it.
  - EasyBuild can install "binary packages" (see the 'Binary' easyblock). Examples are the Intel compilers, CUDA, etc. We don't provide pre-built binaries for software that EasyBuild installs from source though, that's true.
  - EasyBuild has no "environments" concept. The closest thing perhaps is the 'Bundle' easyblock, which can be used to "glue" different environment modules together. We mostly rely on the environment modules tool (Lmod) for this; see for example module collections: https://lmod.readthedocs.io/en/latest/010_user.html#user-col...
  - EasyBuild does indeed require an environment modules tool. Both Lmod (Lua-based) and Environment Modules (Tcl-based) are supported.
  - EasyBuild also supports RPATH linking (see https://docs.easybuild.io/rpath-support/), but it's not the default.

> "EB is really set up to automate a certain type of installation common in HPC"

EasyBuild is definitely geared towards installing software on HPC systems, but there's no "certain type of installation common in HPC": we support software using a wide variety of installation procedures. But maybe you're referring to installing software in a shared filesystem (NFS, usually something like /apps).


Maybe I should clarify:

> it basically boils down to removing that directory + the environment module file that EasyBuild generated for it

Another important thing is knowing what depends on that package (so you don't remove something else's dependency). This is something that EB doesn't track after installation time.

> EasyBuild can install "binary packages"

What I meant here is that EB has no binary packaging system of its own. It has no binary package format, no signing, no way to take an installation and bundle it up into a file that can be installed (potentially in a different location) on another system. Spack, Nix, and Guix all have systems for creating and (re)installing binary substitutes for source builds (we call them build caches). EB can install someone else's binary package/distribution (we can too FWIW), but it can't create its own.

> But maybe you're referring to installing software in a shared filesystem (NFS, usually something like /apps).

Yes -- specifically that, with environment modules. Spack installations similarly all go in their own directories, but you can compose them in many different ways with environments and views. e.g., you can:

    - Create a symlink tree of many packages together in a common prefix (a view)
    - "project" package installations into different directory layouts with symlinks, hardlinks, or relocated copies
See https://spack.readthedocs.io/en/latest/environments.html#vie...
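As a rough illustration (the view path here is made up), a view is just part of the environment config:

```yaml
# spack.yaml -- merge everything in this env into one symlinked prefix.
spack:
  specs: [hdf5, zlib]
  view:
    default:
      root: /apps/myview
      link_type: symlink   # could also be hardlink or copy
```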

And you don't have to load packages one by one with modules. You can avoid modules entirely with `spack load` or use `spack env activate` to put everything in an env into your `PATH` (and other env vars).


Spack has significantly higher adoption than EasyBuild overall, though Spack usage has a US bias while EasyBuild is more popular with EU public sector HPC sites.


From the 7th EasyBuild User Meeting (2022), 'A noob test: Spack "vs" EasyBuild' presentation is available on YouTube:

* https://easybuild.io/eum22/#spack-vs-eb

@tgamblin also presented 'The Spack 2022 Roadmap':

* https://easybuild.io/eum22/#spack

And will also present at 8th EUM (April):

* https://easybuild.io/eum/#spack


Better link for Spack talk at EUM'23: (the 'eum' link is a moving target)

There's also a (now a bit outdated) talk comparing EasyBuild with Spack and other alternatives like Nix/Guix, which I gave at FOSDEM'18: https://archive.fosdem.org/2018/schedule/event/installing_so...

The EasyBuild User Meetings have always been very open to having talks on "competing" tools. Todd (or someone else) giving an update on recent developments in Spack is becoming a tradition (see also https://easybuild.io/eum21/#spack, https://github.com/easybuilders/easybuild/wiki/5th-EasyBuild..., etc.)


Spack recipes are more Python-like, it is less tied to Lmod, and its concretization algorithm has no real rival. EasyBuild supports recursive modules and features to generalize boilerplate.


Another point that I don't think gets emphasized a lot is that research software developers use Spack as a dev tool on their laptop/cluster. Setting up EasyBuild locally is too much effort. And if developers maintain Spack recipes because they need them, it's much easier for sysadmins to deploy that software with Spack too.


I don't think it's about effort to set up EasyBuild (although we do have a hard requirement for a modules tool).

Spack is perhaps more attractive to software developers because it has specific features for that use, like the flexible dependency handling mechanism and the concretiser.

In my view, EasyBuild is better suited than Spack to maintain a central software stack, but I'm definitely biased. :)


Does this work with the "environment module" system (either the perl `module` or modern luamod `module`) commonly used in HPC, and what does the interop look like if so?


Yep, it works with Lmod (https://lmod.readthedocs.io) and environment modules (https://modules.sourceforge.net). See the tutorial on setting up modules for what the integration looks like:

* https://spack-tutorial.readthedocs.io/en/latest/tutorial_mod...
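The integration is driven by modules.yaml; a minimal sketch (the compiler version is illustrative) that enables an Lmod hierarchy might look like:

```yaml
# modules.yaml -- generate an Lmod module hierarchy for installed packages.
modules:
  default:
    enable: [lmod]
    lmod:
      core_compilers: [gcc@12.2.0]  # packages built with this land in Core
      hierarchy: [mpi]              # add an MPI level to the hierarchy
```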

AFAIK environment modules are implemented in TCL, not perl. The original was in C with an embedded TCL interpreter, but when CEA revived the project w/version 4 in 2017, it was all in TCL.


This may also be of interest: https://hpc.guix.info/blog/2022/05/back-to-the-future-module...

Guix also has a compatibility layer with environment modules, though you likely wouldn't compose environments that mix in Guix stuff with existing modules. It's intended as an output of Guix.


Anyone tried using this with ROS2? It has similar issue of using LD_LIBRARY_PATH during building that the HEP packages had. Not to mention its own layer on top of cmake.


Does it have an "undo" option?


It doesn't have that, but what sort of undo are you thinking of? Undo the latest installation on disk? Roll back to the previous environment?


I'd love to have both of these, to be honest.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: