I tried NixOS once and the overall impression was: it's magic. I mean, you know, when you install tools on a relatively "clean" distro like Arch, something usually fails, and you have to try this and that, tweak something, and it's the good old "Linux way".
And NixOS is something so elegant and beautiful, so mathematically "right", that I subconsciously expect that it wouldn't actually work, and yet it does. I tell it what I want to get (a relatively marginal configuration, btw) and it configures the installation just right. I break something and it rolls back without a problem.
So the main question I've had ever since is: why isn't this stuff more popular yet?
Nix is nice. However, some practical aspects are (still?) quite ugly. It lacks decoupling between packages and optional runtime dependencies. You can disable optional dependencies, but that leads to a different package hash, negating the use of prebuilt binaries.
Therefore, the culture seems to be to ship default package builds with all optional dependencies enabled. This leads to situations such as installing mutt and getting Python too! (mutt -> gpgme -> glib -> python)
Last time I checked, if you installed git, you'd also get subversion, etc. Quite sad, given that Nix is full of so many fantastic ideas. Hope it matures soon.
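To be fair, you can strip optional dependencies yourself via an override, at the cost of a local rebuild (and hence losing the binary cache, as noted above). A sketch of the idea - the parameter name here is illustrative and may not match the actual nixpkgs definition of mutt:

```nix
# Hypothetical sketch: build mutt without the gpgme -> glib -> python chain.
# "gpgmeSupport" is an assumed override flag, not necessarily the real one.
{ pkgs, ... }:
{
  environment.systemPackages = [
    (pkgs.mutt.override { gpgmeSupport = false; })
  ];
}
```

The override produces a derivation with a different hash, so it lives alongside the default build in the store rather than replacing it.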
I tried NixOS and had a very different experience. It is still a very new project and I encountered some very strange behavior which might be bugs or the result of me not fully understanding what is going on. I think I understand the premise quite well, having learned Haskell to an intermediate level, but in the end things didn't really work as expected.
It would have required a significant investment of time from me to understand what is going on and get everything set up the way I want, and I was just not willing to put that in at the time. I've gone back to Ubuntu for now.
I am doing more and more Haskell development so I'll take another run at it or, at least, set up nixpkgs on Ubuntu but I will probably give the project some time to mature.
I'm a mechanical/software engineer, but I set up our company's web and continuous integration servers using Ansible. The experience overall was excellent, and everything worked mostly as advertised. A couple pain points I had were:
* If I removed something from the Ansible configuration, it stayed on the server unless I explicitly removed it manually. This created hidden dependencies. I solved this problem by creating a brand new server and running Ansible on it from scratch every so often. I have considered setting up CI for our Ansible configs by using Vagrant to recreate our server architecture, running Ansible on the virtual machines, and ensuring everything works.
* Our continuous integration setup requires Ansible to be installed on the CI server, so it can automatically deploy to staging using the same playbook (Ansible configuration) we use for deployment. Our staging server is the same as our CI server, and it was actually a pain to set up deploying locally as root. Also, I feel like allowing the CI software to use root is a security hole.
I also spent some time with NixOS a year ago, and I was very impressed with how it manages packages. The first problem I mentioned with Ansible seems like it wouldn't happen with NixOS, since not including a package in an environment means it won't be present. Second, it also seems that you could use Nix's declarative configuration language in restricted environments, which wouldn't necessarily require root, instead of having to install system-wide packages for a particular deployment. I am not sure how easy this is in practice.
Currently, I am using Arch Linux, and I installed the nix package manager to play with some more. In the future, we might be provisioning AWS servers in real-time to run simulations given to us by customers (we make simulation software), and in that case I am going to investigate NixOS more.
Most orchestration tools are stateless from run to run, and as such can't identify if a package has been removed.
The solution in my experience is to set the state to "absent" instead of removing the entry from the playbook, and then refactor it out at some point in the future.
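Concretely, that means flipping the task's state rather than deleting the task (the package name here is made up; the pattern is standard Ansible):

```yaml
# Keep the task in the playbook, but declare the package gone.
# Ansible removes it on the next run; once every host has
# converged, the task itself can safely be deleted.
- name: ensure legacy-tool is removed
  apt:
    name: legacy-tool
    state: absent
```

This keeps the removal tracked in version control instead of relying on someone remembering to clean up by hand.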
Yep, this exactly. If you want to make sure something is removed, like from a customer upgrade, leaving it in the configuration for a while is reasonable.
Really I wouldn't want untracked resources to be automatically removed - because someone might have had a reason for installing them.
Though, yes, the question of removal of cruft does come up.
However, if you are doing immutable systems, it's likely you are doing Ansible builds from something like Packer or aminator, and are doing red/green deployments that completely replace the OS.
I think immutable systems are likely the future for applications; they just take some extra work to get your brain around, which may make them be viewed as "not worth it" in some capacities - those dealing with a lot of data that needs to "stay around", legacy applications, etc.
However, this is also why Docker is popular - because it's providing people a quicker way to get into the immutable system workflow.
People have tried to make a better RPM in the past - I worked at a startup that did this at one point (rPath, now defunct). Though, ultimately, I think people want to work in higher-level primitives than package managers, and the use cases involved in migrating a system from version A to version B are often not entirely clear-cut.
Configuration is also manageable by things like etcd - I think a new package manager is ... I guess conceptually interesting. But it's kind of the last thing I'd want.
To control what versions you install, it can often be easier to just maintain good software repos, rather than enforcing it on each individual system.
Yes, the general rule of thumb is to always place the state argument in commands so you are explicitly defining what the state should be.
That's how I remove / add users or packages or configuration files. It also gives you a record of what you had on there before, which has been nice in some cases.
> Most orchestration tools are stateless from run to run, and as such can't identify if a package has been removed.
But NixOS is stateless from update to update as well. The key difference is the scope of what is described in a playbook versus what is described in a NixOS configuration file.
This kind of discussion seems irrelevant once you move to the immutable deployment model. A new deployment means a new server that the load balancer switches to. Discard the old server. Obviously this doesn't work as well if you are not on the cloud or otherwise not operating in a cloud-friendly (don't write to the filesystem unless it is something like ceph) way. But I have no idea why people are still updating cloud servers instead of replacing them.
Discarding the old server and replacing it with a new one is a very brute force way of dealing with the problem. That's not to say it's bad (it will obviously work exactly as advertised), but I don't see it working well for me. In particular, I want something with a quick turnaround.
I very frequently deploy to a virtualbox VM during development. With NixOS this often takes <10 seconds, and that still feels slow. I cannot imagine that you can do immutable deployment anywhere near as quickly or conveniently (but I'd be excited if you can tell me I'm wrong).
Yes and no. It's true that immutable servers work around the flaws in puppet et al.
However, Nix is still really useful even when you have immutable servers. Once the components of an app are described as Nix packages, you have a lot of flexibility in working with them:
• deploy to a local sandbox for development
• deploy to virtualbox instances for end-to-end testing
• deploy to cloud instances for testing or production
..all from the same package descriptions. I've found it really valuable to have a precisely-defined and exactly reproducible environment for my code, whether in development, testing or production.
Deploying to a sandbox, VirtualBox, or a cloud instance is just a small change of IP address in Ansible.
This is convenient, but in my view it's not really important. The way NixOS provisions a new system is great, there is no doubt about that. But the process of making an immutable system is, to me, a bit tricky.
In NixOS, for now, you cannot debug a provisioned system without doing it "properly", and that takes time. It would be great if NixOS had a debug mode for the Nix package manager; I don't think that would be so hard.
With Ansible (and the like), you can debug, note the fix down, change the playbook, destroy the old machine and bring up a new one. It may cost about the same amount of time.
I have yet to find a single VPS provider that offers explicit support for NixOS [1]. As much as I like the idea of NixOS, it's difficult to evaluate its utility when I cannot easily spin up a NixOS-powered VPS and kick the tires. There does not seem to be any support for NixOS on DigitalOcean [2], and the instructions I came across for Linode look daunting enough that I am unlikely to even try [3].
Pick any VPS provider offering KVM, and they'll likely offer the ability to insert your own boot CD, from which you can install NixOS. I use DireVPS for a small personal server, for example, although you'll have minor issues installing (you'll need to configure the network from the command line before starting to install, as well as specify the network configuration in the NixOS configuration.nix separately).
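For reference, the static-network part ends up as a few lines in configuration.nix - addresses here are obviously placeholders, and the option names are from memory (they have changed slightly across NixOS releases):

```nix
# Sketch of static network config for a KVM VPS without DHCP.
{
  networking.interfaces.eth0.ipv4.addresses = [
    { address = "203.0.113.10"; prefixLength = 24; }
  ];
  networking.defaultGateway = "203.0.113.1";
  networking.nameservers = [ "8.8.8.8" ];
}
```

Until that's in place you'd bring the interface up manually with ip/ifconfig from the installer shell so the install can fetch packages.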
If they're running one of those, yes. In general, if a VPS provider doesn't tell you what they're running somewhere (it's usually Xen or OpenVZ, KVM's rarer, Docker's a new platform that doesn't really do the same things), you want to stay away from them.
I've set up NixOS on a Linode, and I was surprised how fast and easy it was: almost instantly after running that bash script, I had NixOS set up and was ready to upload my configuration. I didn't even need a preconfigured image. Setting it up as a mail server, however, wasn't so much fun.
Package management is great, and Nix is neat, but Nix isn't the entropy-eliminating panacea the author makes it out to be.
Just like every other package manager, from rpm to dpkg, Nix is responsible only for the aspects of the filesystem that it's specified to be responsible for. It's not going to remove garbage left behind by users or poorly-written post-install scripts, and it's not going to automatically undo other non-filesystem-related state changes on package removals. And the practical reality is that not everything on a system can be a native package anyway.
Well, he does mention in the first few paragraphs that "it gets you out of the notion of doing anything manually". The key here is that Nix systems are meant to be configured only via Nix expressions. Any junk you throw onto the filesystem manually is not part of the concerns of the rest of Nix, because the software packaged by Nix can't even see that junk (when building software or using a Nix environment) - it can only see the dependencies specified in the package definitions. The package definitions are what matters, because those are the part you redeploy to other systems.
Any system is welcome to add its own entropy, put its own junk on - but when you want reproducibility, you do it by specifying how to get that reproducibility in your Nix expressions.
One could argue the same for any package-oriented system, from Red Hat to Solaris to Debian/Ubuntu ("only use packages"). The reality is quite different, and it's always been different. I'm unaware of any environment that's used packages for every aspect of system management, and Nix isn't going to change that.
Moreover, there's nothing stopping a Nix package author from writing a post-install script that causes state changes to be made out of bounds. The package manager isn't going to notice, and it's not going to clean it up at deinstallation time.
I think you misunderstand the architecture of Nix. Every package in Nix gets put into an immutable /nix/store. Packages are identified by a hash of their contents, so a small mutation of any package definition gives it a new identity, and thus, a separate package derivation.
Every package in the store has exact dependencies - they reference the hashes of other packages only. When a package is built with Nix, only the directories for these packages are exposed to the chrooted environment in which the build occurs - so it is simply not possible for some randomly added junk in the filesystem to make its way into a Nix-defined package.
Contrast this to building software on another machine, where I might depend on "glibc" version "1.0". The combination of these two values hardly represents a unique identity, as I could make any random package that fits those requirements. It's much more difficult for me to create a package that results in a hash collision, though.
One thing that makes the other systems so unreliable is the presence of multiple repositories. If you were going to deploy packages from a single repository, then you could do careful planning in such a way that packages do not have any collisions. Assuming no user mutates the directories under the control of the package manager, such a system will also be effectively reproducible. Current mainstream distros work surprisingly well because they basically use this model, where a default repository provides most users' needs. These distros quickly break down when you start adding third-party repositories which bundle alternative compilations of the same software that sits in the "official" one.
Basing package identity on hashes, rather than names and version numbers, and making sure all of the files for each package are held in an isolated directory, ensures that collisions won't happen: even if two different repositories provide the same software name and version number, they are represented by different hashes and will be treated as distinct pieces of software.
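The identity-by-hash idea can be sketched in a few lines. This mimics the spirit of Nix's store paths, not the exact scheme - Nix hashes the complete build recipe (sources, flags, dependency hashes) and uses a different encoding:

```python
import hashlib

def store_path(name, recipe):
    # Stand-in for Nix's derivation hashing: any change to the
    # build recipe changes the digest, and thus the identity.
    digest = hashlib.sha256(recipe.encode()).hexdigest()[:32]
    return f"/nix/store/{digest}-{name}"

# Same name and version, different build recipes:
a = store_path("glibc-1.0", "configureFlags = []")
b = store_path("glibc-1.0", "configureFlags = [ --hardened ]")
assert a != b  # distinct identities, so the two builds never collide
```

Two repositories shipping "glibc 1.0" with different patches thus coexist in the store instead of clobbering each other.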
I like the architecture of Nix, and believe there are many benefits to its packaging model. However, I am not a huge fan of the syntax[1], and would rather use a template engine I am quite familiar with (like Jinja).
I'll continue to follow and learn more about Nix, but I'm still anxiously awaiting my epiphany.
Yeah, the problem with Nix is that it's different. The Nix language is a lazy functional DSL, which takes a bit of effort to learn.
The thing is, "better" does imply "different." The syntax is weird and difficult, but it's the linchpin of the whole system. It's much, much better than your typical packaging system and worth the effort to learn it.
Do you wish nix's syntax were more like jinja? Or do you actually want to use jinja with nix somehow? I don't really know what the latter would mean, since nix is a programming language and jinja is a (string-based) template system.
Nix doesn't have post-install scripts. (NixOS sort-of does, as its global "activation script".) In any case, nothing can force you to do anything - only give you tools to work in the right direction.
Nix gives you tools for dealing with the concept of an immutable system - there are very few other working tools that do the same. Perhaps it can't possibly get all of it right - database setup and schemas, for example - but that's alright, you can use other tools to deal with them! (There's certainly a glut of schema management tools that would fit right into a Nix system.)
In general, you could integrate a tool like Alembic into a NixOS system, by having your NixOS configuration copy your migration files to a persistent directory in the activation script, and running the relevant command to get to a certain revision of your schema specified in your NixOS configuration. That way, you always have all the scripts in a persistent directory to roll back without having to keep them in your git repository until some undefined point in time, and it's tied in reasonably well.
However, NixOS doesn't handle multi-system upgrades very well on its own - for example, you need to have some external way of dealing with "we need to upgrade the schema before restarting the application servers one at a time". I think that's a whole project on its own, and something that almost everything seems to do on an ad-hoc basis (Salt excluded).
This is just what I do. I don't use Alembic, but I've got a really simple tool that handles migrations. The tool, along with its sequence of migrations is deployed along with the app. The systemd service for Postgres runs the tool after the database is up.
Because the service definition has my migration tool as a dependency, it gets rebuilt when the tool is updated. That causes Postgres to get restarted and the tool to get run IFF there are new migrations. The nix configuration is parameterized, so that in dev or staging environments, the database is completely rebuilt and loaded with test data on each deploy, but in production, we just apply migrations.
With Nixops, this works even on a multi-system deployment. The full deployment typically takes several minutes, but most of that time is in installing the updated packages, which doesn't interrupt the running system. Once everything is installed, the cut-over only takes a second or two. Eventually, I'll have enough traffic to warrant more careful orchestration of the cut-over, but for now it's fine.
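The shape of that setup, very roughly - the service and package names here are hypothetical, but the pattern is just a NixOS systemd unit whose definition embeds the migration tool's store path, so updating the tool changes the unit:

```nix
# Hypothetical sketch of the migration-on-deploy pattern described above.
{ pkgs, myMigrator, ... }:
{
  systemd.services.db-migrate = {
    after = [ "postgresql.service" ];     # run once the database is up
    wantedBy = [ "multi-user.target" ];
    # The unit text embeds the store path of myMigrator, so any
    # change to the tool yields a new unit, which NixOS restarts
    # (and thus re-runs the migrations) on activation.
    serviceConfig.ExecStart = "${myMigrator}/bin/migrate up";
  };
}
```

Parameterizing this module (dev/staging vs. production behavior) is then just passing different arguments to the same expression.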
Please forgive me if I have trouble taking advice from someone who has set up "tens" of servers and doesn't appear to understand the current set of orchestration tools.
Don't get me wrong, Nix sounds great, but this is a poor article from an inexperienced sysadmin who is unable to really point out the pros and cons.
As a guy who is in no way a sysadmin but has devops experience, I thought he nailed it pretty well. Take Ansible. It is declarative: you write down the list of packages you expect to find on the target machine(s), and Ansible will install any that are missing. Now, you remove one of these packages from your playbook. Is it magically uninstalled from the targets? Absolutely not. Because the playbook doesn't describe the "state of the world" in its entirety. A NixOS file does that. Remove postgresql from a master NixOS configuration file? Update, and no more Postgres. Because everything that NixOS manages is described (installed software, configuration, etc) (1).
1: Actually, last time I looked at NixOS, you could also install things the imperative way, but that's a silly thing to do.
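For illustration, the "state of the world" nature of a NixOS config looks roughly like this (a minimal sketch; option names as I remember them, and they vary a little between releases):

```nix
# /etc/nixos/configuration.nix -- the whole system in one place.
{ pkgs, ... }:
{
  # Delete this line and rebuild: Postgres is gone from the system.
  services.postgresql.enable = true;

  environment.systemPackages = with pkgs; [ git vim ];

  users.users.alice.isNormalUser = true;
}
```

Anything not declared here simply isn't part of the next system generation, which is exactly the property the playbook model lacks.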
He did forget to mention Terraform, a tool that suffers from none of the problems he mentioned. And I'm sure there are other DevOps tools that handle state management.
All these tools have to deal with the problem of state. Most of them choose to handle it implicitly: the state of the system is the state, and the tool tries to determine it at runtime. This strategy has all the problems listed in the article. Other tools attempt to keep a record of the state. This solves the issues from the article but has its own set of issues (mostly state management and state corruption). Neither approach is objectively right, and both have their tradeoffs.
I honestly agree with the original, down-voted, poster...the article is a bit preachy without much insight. The rise of DevOps has turned system administration into a CS problem. And, like most other CS problems these days, it all boils down to managing state. And just like there are many approaches to state management in application architecture (RDBMS, NoSQL, Paxos/Raft, etc) where no solution is objectively right, there will be many approaches to state management in DevOps where no solution is objectively right. The author obviously has his preferences, but his analysis is biased and incomplete and should be called out as such.
OP here. I've heard of terraform, although I've not investigated it much further than that.
It sounds like it's mostly a provisioning tool, and doesn't really help with configuration management once your machines exist. From the "terraform vs chef / puppet" page:
> Terraform enables any configuration management tool to be used to setup a resource once it has been created. Terraform focuses on the higher-level abstraction of the datacenter and associated services [...]
So it sounds like a very nice provisioning tool, but it doesn't really compete with NixOS itself. Perhaps you could even use it to provision NixOS machines?
You probably want to check out HashiCorp's full stack they recently released dubbed 'Atlas', in which Terraform is one small component: https://atlas.hashicorp.com/
I don't want a tool that does both provisioning and configuration management as I feel the latter should be a real-time concern rather than a deploy-time concern. Using Consul (https://consul.io) and Consul Template (https://github.com/hashicorp/consul-template), we're able to keep configuration centralized, secure and have it deployed automatically every time it changes. And it removes the distinction between configuration changes that are triggered by some event (machine failure, network partition, monitoring, auto-scaling, etc), changes that are triggered by a developer commit and changes that operations wants to make (maintenance, DDoS response, etc). Terraform provisions all of that and then configuration management happens on an ongoing basis.
I'm really not trying to sound like a shill for Hashicorp, but we use a bunch of their tools and find them to be, overall, very worthwhile and focused on accomplishing a single logical task which makes them easily composable with tools from other vendors. I also don't want it to sound like I'm criticizing Nix or NixOS...they sound like excellent tools. My only point was that there are other ways to solve the problems expressed in the posting and that each solution has tradeoffs that DevOps needs to consider when designing infrastructure. Your blog struck me as being a strawman criticism of somewhat dated tools without consideration for newer options, especially since your discussion of Docker was so narrowly focused on the actual Docker tool without any consideration given for Fleet, Swarm, ECS or any of the host of orchestration options in the Docker ecosystem.
If you'd written it more from a position of "here's how NixOS has made my life easier," you'd probably find that people would be more receptive to it. But, instead, it had a "here's why NixOS is better than the alternatives" feel to it which is going to rub people the wrong way when it's pretty clear that you're not aware of all the alternatives. NixOS is one good option, but it's by no means the only good option.
> Perhaps you could even use it [Terraform] to provision NixOs machines?
You definitely can, so long as your servers are virtualized. Terraform is significantly less useful in a bare-metal world. However, Terraform is really about provisioning specific machines. For example, you might write Terraform to provision 1 SMTP server, 3 web front-ends behind a load balancer, and 2 database hosts. But it's pretty crude at doing provision-time tasks: it basically allows you to run shell commands. Where you'd do the bulk of your provisioning would be in a tool like Packer, Aminator, or some other tool that creates VM images that can be deployed. That's where you'd start with a base NixOS image and then declare what's installed on an SMTP server, a web front-end, and a database server. Terraform would just reference those images and size the machines.
Fair points. I started the post by stating that I (personally) wanted to use NixOS in the future, but admittedly didn't maintain that tone throughout the piece.
I definitely have humble requirements in terms of deployment size, so (for me) any orchestration tool is likely to be way more effort than it's worth. I compared NixOS (not a deployment tool) to those other (small-scale deployment and/or configuration management tools) because that's what I and plenty of developers I know have used, and I think the comparison helps illustrate the issues that NixOS can solve. Hopefully those who _do_ have experience and need for larger orchestration software can tell from reading whether the problems NixOS solves are relevant to them.
> Actually, last time I looked at NixOS, you could also install things the imperative way, but that's a silly thing to do.
Actually not, because if you do, you will find these changes added to a file that declaratively describes all the changes you have committed to the system (or user context) as a whole.
It's a good way to build a config if you're not fully comfortable with Nix syntax.
I don't remember where the file ends up, but it is there.
(I'll admit that it doesn't address the point you were trying to raise rather ungraciously, namely, how well NixOS deployment tooling scales to really large deployments, which is a good question to ask.)
That's a fair question. I'll let you know if I find out ;)
My hunch is that it could scale well, but would require some integration work in order to use it nicely with existing orchestration tools. I don't know exactly what that would look like, or if it would be worth the effort of going off the beaten track.
Yeah, that's what I was doing for almost 5 years before I found Ansible. Now I don't even bother to insert a CD. I go to my VPS provider, add a VPS, add my key, and run an Ansible playbook, which installs whatever I want, the way I want it.
The thing with package managers is that it gets boring to do everything all over again for a large number of servers. Also, your brain does not have reliable version control. You need to keep the why and how of what you do on your servers somewhere else. The first thing that comes to mind is a shell script, and people are already using something better (Ansible, Puppet, etc.). Now, with this article, I see that there is something even better than that.
It's not just about installing a few packages with the package manager. You have system specific configuration, ranging from services to users to file systems and more, sometimes at odds with the state of a security patched/updated machine at any point in time. The way you configured the state of a machine 12 months ago might not be reflected when you build, update, and configure a machine today. This disconnect between installs gets compounded when you have to randomly log in and make minor tweaks to keep up with your ever-changing application specs (tweaks which rarely get documented), causing you or your successor to scratch their head wondering what steps are missing compared to before when doing a new build. This gets even more complicated when you build large clusters meant to scale, and can't afford to spend the time it takes to individually configure each machine. Orchestration aims at fixing this by giving you a way to bring the state of a freshly installed machine to the correct state of your production environment in an (ideally) consistent and repeatable manner, which is the whole point of all these fancy new products we're seeing these days.
One reason is because there is more than one of you.
By using Puppet, I never have to read my colleagues' custom setup scripts, because everything is stored there. By checking everything in to version control, I can see who has done what by reading the changelog.
If you are only managing your own servers and keep a strict discipline you will see less of the benefits.
Actually, the point is not that Nix is bad (though there are reports of packages installing things you didn't want because someone thought a dependency of the root package might need it). It's that the author doesn't have enough experience as a sysadmin to speak on Nix's actual strengths and weaknesses when compared to tools they don't appear to understand very well.
This article is fluff because of this lack of experience; it honestly reads as another "look at this cool thing I know nothing about!"
Ok, if it's fluff, do tell us what issues he overlooks. If he's made a mistake about how Puppet or Ansible work, point them out. In other words, criticize what he said, not his pedigree.