Don’t use environment variables for configuration (nibblestew.blogspot.com)
140 points by ingve on April 1, 2021 | 286 comments



I disagree with this post so strongly - having spent most of my career installing, configuring, and managing other people’s software.

> The answer is that you, the end user, can not now. Every program is free to do its own thing and most do. If you have ever spent ages wondering why the exact same commands work when run from one terminal but not the other, this is probably why.

If the same program is behaving differently between two systems - it can -only- be the environment that’s different.

> Instead of coming up with a syntax that is as good as possible for the given problem, instead the goal is to produce syntax that is easy to use

Oh to be a developer. The “best possible syntax” is a universe of possibilities - environment variables are, thankfully, limited to strings. If all programs had to be configured with a Turing complete config language - that would just be a programming language! Limitations can set you free.

Sorry for the harsh tone. Please, and I believe I speak for most sysadmins, please continue to use environment variables.


This is a voice of reason, not harshness.

One of the projects I currently work on has a configuration system/model/table that is a monster. I wish it was just strings. It basically contains boolean flags, strings, numbers etc., but the most insidious part is that it can contain groups of related configs - meaning people started dumping stuff into it that should be in a normal table (e.g. ShippingType: Road, Rail, Air), so we get no foreign key constraints for reference data. This caused them to implement soft-deletes for it, so now you have some config values that float around forever because they were referenced somewhere (a pseudo foreign key). It's so utterly dumb I want to delete the whole thing, but everyone (non-tech people) thinks it works great.

I'm a developer, not a system admin, but this is too much. It doesn't help that 5 different people have added 'features' to it over the years. The same thing can be accomplished with something waaaay simpler/cleaner. We also keep having config-related issues where people blame the system/servers/devops etc. Every single time it is due to misconfiguring the project, so the system admins cannot do anything about it anyway (meaning we keep wasting their time thinking the problem was with the servers). Let me pause here, need to take a blood pressure pill.


> The “best possible syntax” is a universe of possibilities - environment variables are, thankfully, limited to strings.

The same can be said for command line arguments. I usually prefer those instead of environment variables, because they must be explicitly specified instead of being implicitly passed by the parent process. I think of env vars as being more useful when repetitively calling commands interactively, to save some typing, or when you really do want processes to inherit config from parents (like PATH and its variants).

But overall I still agree with your sentiment.


100% this. The developer behind Prometheus was a huge dick to people about env vars a while back, in similar fashion. Just the other day, they held another closed-doors vote after a year or so and finally decided they were OK.

Doesn't surprise me the creator of Meson of all people made the same dogmatic assertion.

What a circus this industry has become.


Any link about that? Are you talking about this? https://github.com/prometheus/prometheus/issues/6047#issueco...


It's fascinating how you can do this job for decades and learn about new tools daily. And I mean tools that have been around for ages.

I just learned about envsubst https://www.gnu.org/software/gettext/manual/html_node/envsub...


This was the original discussion that spanned quite a long time. https://github.com/prometheus/prometheus/issues/2357


I came here for this comment. I'm a developer and I have been using env vars for quite some time after I got burned numerous times by other options. I just can't see why I would use anything else.


I have no skin in this game, so to speak, so just curious.

In what ways did you get burned by options other than env vars, where env vars would not have burned you?


Mostly configuration stored in the database (who configures the database) or 3rd party configuration services without an SLA and config files that are either present or not. Env vars are really simple since they are completely decoupled from the app and you can have default values for all of them. You just need a single class that loads all config on startup and you can go from there (or fail if you don't have the mandatory vars).
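
A minimal sketch of that startup-loading idea in Python (the variable names and defaults here are made up), using a frozen dataclass so the config can't be mutated afterwards:

    import os
    from dataclasses import dataclass
    
    @dataclass(frozen=True)
    class Config:
        database_url: str                 # mandatory
        smtp_host: str                    # optional, has a default
        debug: bool = False
    
    def load_config() -> Config:
        if "DATABASE_URL" not in os.environ:
            raise SystemExit("DATABASE_URL is not set")  # fail fast at startup
        return Config(
            database_url=os.environ["DATABASE_URL"],
            smtp_host=os.environ.get("SMTP_HOST", "localhost"),
            debug=os.environ.get("DEBUG", "0") == "1",
        )
Everything downstream takes the resulting Config object; nothing else reads os.environ again.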


>who configures the database

or, what configures the database connection


I forgot the question mark: (who configures the database?). So the problem is that if you have your configs in a database, then you carry the additional burden of separately setting up databases for each environment... it is like trying to put the hash of an image on the image itself.


They also work well with serverless environments, containers, etc., which can’t be said for some alternatives.


Environment variables are also more portable and cheaper/simpler than a DBMS, an LDAP service, an application server's proprietary configuration repository, etc., without being weaker at specifying simple configuration values.

In practical terms, specifying environment variables in cleanly isolated and composable layers (defaults by user, a specific terminal session, a script that calls another script or the actual program) is a major advantage over more enterprisey and monolithic mechanisms.


Bingo.

The author's title is unfortunate: "Never use environment variables for configuration"

Not everyone is writing CLI scripts. Some are writing multi-environment software for the web. Some people care about git and distributed teams.

Let's say you back up and migrate a db which has config in the db. Well, your other environment is now using the wrong config! Now you're possibly using prod SMTP credentials and sending notifications to the wrong people, because all you wanted to do was have live content and do some testing, or show debug messages because you're on a test environment.

Web frameworks which have the HTTP host in the database drive me nuts. Why do you need that? Your webserver is responding on a domain name. Why do some store the absolute URL in the db? It makes no sense. For the 3 people that want to serve up domainA.com but have all links really be domainB.com, maybe that makes sense.

Don't get burned people. For most things, just ignore this article, keep your secrets out of version control and keep your code ready to deploy to multiple environments. Decouple that. Your app should be able to work with various configurable services (SMTP, push notifications, database, etc) and you don't want that config in version control.

So either a file that's outside of version control with variables you set per environment or actual server environment variables.

Environment variables in many web languages are just namespaced globals. Still better than global variables. They serve a purpose.

The author has this:

    int first_argument;
    int second_argument;
    int add_numbers(void) { return first_argument + second_argument; }

While it helps the author's agenda, that's not a legitimate example.

A real example would be a service provider which a developer would understand to have the sole purpose of pulling from environment or config values to initialize.

You wanted a Twilio client? Well, we know it needs some keys. Use environment variables. Boom. Everywhere you ask for this Twilio client you get the same instance with the config that that environment needs.

This is not a problem. It works well. Better than any alternative.
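
As a rough sketch of that provider pattern in Python (ApiClient and the variable names here are hypothetical stand-ins, not Twilio's actual SDK):

    import os
    from functools import lru_cache
    
    class ApiClient:                      # stand-in for a real SDK client
        def __init__(self, account_sid: str, auth_token: str):
            self.account_sid = account_sid
            self.auth_token = auth_token
    
    @lru_cache(maxsize=None)
    def get_client() -> ApiClient:
        # Read the environment once; every caller gets the same instance,
        # configured for whatever environment the process was started in.
        return ApiClient(
            account_sid=os.environ["ACCOUNT_SID"],
            auth_token=os.environ["AUTH_TOKEN"],
        )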

Author also says this "Environment variables is exactly this: mutable global state."

For some CLI applications maybe. But for web languages this is not accurate. Some languages, yes, you can mutate the env variables at runtime. But your code shouldn't rely on any mutable env vars. You can load those env vars into a config. Even cache it. Not allow mutations. Many modern frameworks support this or you could implement it yourself.

And if you have a CLI that really needs to be explicit about the environment vars to run? And not be different the next time you call it with no env vars? Well maybe take those as arguments?

Plenty of CLI apps store config in JSON and have a wizard to set those credentials - the AWS CLI, for example.

Now you got JSON as the author wanted. That can work across platforms. But guess what? You work on multiple clients and hosts and next time you call it? Well you might not be expecting that it had config for another client or app.

Same problem. The state was mutated. You called it again. You potentially got burned. Now you have to do the CLI wizard to reconfigure. Or you pass in explicit params if allowed.

Definitely a consideration if you are writing such a program. Do you make it explicit with arguments and options? Load from ENV vars? Have some CLI wizard and save to JSON?

The author's suggestion of JSON is no reason to toss out ENV vars. Just solves slight differences between Windows and Unix. Which is why you can program a CLI wizard if you care about that problem.

We don't need to say no to env vars just because we want CLI app users across different platforms to have the exact same API. Having to set up a JSON file just to use a program is really lame, which is why you see CLI wizards when you run them.

Or things like "aws configure". And you still have CLI arguments and options available.


I left the question mark out. It is "(who configures the database?)".


> If the same program is behaving differently between two systems - it can -only- be the environment that’s different.

Yes, but the entire problem is that "the environment" is massive, as it includes all hardware and software running on the device in question (and quite possibly other devices, as the network can easily be considered part of "the environment"), which makes it difficult to track down differing behaviour.

"The environment" is not just environment variables. I've run into a spreadsheet bug where I got wrong results because of a CPU bug. Just because some global mutable state exists, that doesn't mean it's a good software design to have program behaviour depend on it.


>If the same program is behaving differently between two systems - it can -only- be the environment that’s different.

As a developer, I've generally used configuration files for changing the operation of my software and environment variables for information my code needs to know about WHERE the code is running. For example, the same code doing the exact same thing on 10 machines would have the same configuration file (or command line parameters in simpler cases) but the environment variables may change from machine to machine.


I think the post has some merit if we differentiate strongly between what is truly external to the code.

The article's point stands if we're being lame and treating internal code details as external.


> Oh to be a developer. The “best possible syntax” is a universe of possibilities - environment variables are, thankfully, limited to strings. If all programs had to be configured with a Turing complete config language - that would just be a programming language! Limitations can set you free.

I was thinking about this while reading the post. What's the best language or syntax for configuration, if environment variables are too simplistic and full-fledged programming languages are too powerful? That's why I'm interested in Dhall, which is a configuration language that's limited to compile-time only–i.e. it can't do anything at runtime.


There was quite a vogue for using programming languages to configure things for a while. It still has its place I think.


I feel one good argument against env vars is that loggers might log them, but I've never been bitten by that myself.


Chromium with --v=1 will indeed spit out the API keys it's been configured with, and the debug output by default gets logged into a file.


This attitude is how we ended up with Active Directory. One assumes that the author doesn't spend much time in a command-line environment. The option to set defaults for one's normal working environment is obviously of value, and environment variables are hardly exclusive of configuration files (~/.bashrc, makefiles, ...). The theory that complex syntax and statelessness are inherently good and cost-free is naïve to the point of parody.

The author is correct that "this is the way we have always done it" isn't a good argument in and of itself to persist in a practice. However, they might be rewarded by a few minutes pondering a related idea: "if generations of people---many quite capable of modifying the system to use something else---persist in using something, it's possible they have a reason for doing so other than a deficit in competence or imagination."


On the surface, environment variables may seem like they're analogous to global variables, but they're not.

Environment variables are scoped to the current process. This could be your shell, but it could also be a web server. This doesn't make them leak proof, but unlike global variables, environment variables have a scope.

Environment variables are also used more widely as an API. Many CLIs have a command that, when self-executed, can act as a persistence layer in your shell. This functionality would be impossible without environment variables.

You mentioned the other motherlode, which is that environment variables are quite often, again, used to communicate in Makefiles. You can see the natural scoping if you start to kick off an ad-hoc shell process inside a Make target.


I don't get this. From the same perspective you can argue global variables have scope too since they are "scoped to the current process". It's not like other processes can access your global variable, that's a very low bar for a scope.


The difference here is that a running shell (including the shell environments that services run within) is an abstraction layer for managing processes. Within that abstraction layer, each process can have its own unique environment variables that it can change and manipulate independently. That makes them not global.

Within a process the top level abstraction is the process itself, and anything underneath (class, method, function) will be impacted if another sub-abstraction makes a change to a global variable.


Makes sense, thanks.


> It's not like other processes can access your global variable, that's a very low bar for a scope.

They can read them, in /proc/[PID]/environ
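
For example, a quick Python sketch (this only works if you have permission to read the target process, typically the same user or root):

    import sys
    
    pid = sys.argv[1]
    with open(f"/proc/{pid}/environ", "rb") as f:
        # entries are NUL-separated KEY=value strings
        for entry in f.read().split(b"\0"):
            if entry:
                print(entry.decode(errors="replace"))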


Interestingly, reading this blog post, this doesn't seem to be common knowledge. The first comparison of global variables inside a process and environment variables left me wondering. It just felt wrong.

A gripe I have with environment variables is when they are used to modify a program's behavior deep inside its belly and aren't treated like configuration input similar to program arguments.

Otherwise they are a universal way of configuring applications. Universal is good, universal is nice.


> Environment variables are scoped to the current process.

Obviously this is true in a sense. But for practical purposes, it depends on how the environment variables are set. For example, if I set them in my ~/.bash_profile, they're scoped to all bash processes that my current user runs. If I put them in /etc/profile, they're effectively global.


Depending on the variable, sure, but that's a wide latitude in the interpretation of "global". They're not linked or the same memory in any form or fashion. They're merely identical in value.


True. I was thinking in the context of software deployments where environment variables are commonly used as read-only configuration values once the deployment is made.


@ajarmst is that last related idea you quoted your own? That’s a powerful expression of how I see human culture: question everything, but remember to respect the ideas of the people who came before you. There might be a baby in the bath water you’re discarding.


It's a restatement of the lesson of Chesterton's Fence.

There is a fence somewhere you wish to get rid of. Nobody around knows why it was there in the first place. No one will let you tear it down until somebody figures out why it was put there in the first place.

The important takeaway is that before changing something, one should understand the history of the thing, or one runs the risk of falling victim to the same problem the unknown thing was put in place to address.

Life is way more complicated than the umwelt of any one individual, so it is not safe to just change something without doing the footwork to understand what led it to be there in the first place.

Demonstrate that work has been done, and generally, no one will get in your way.


The sentiment certainly isn’t original. The phrasing sounds like me, though. I can get pretty pedantic.



What do you mean about Active Directory?


Strongly disagree. Environment variables are, IMHO, the best tool for some simple configuration in unix. They match perfectly with the behavior of the ecosystem and other tools in it (like the unix shell).

Yes, if your OS is some universal JS machine, then JSON would be better; if it is a Lisp machine, then you would use S-expressions; but on a Unix machine, environment/args are the way to go.

There are two realistic alternatives - config files and arguments. They each have their own niche, and the environment sits somewhere between them.

Arguments are better for one-shot setting, not for some setting used always. You can use 'alias' to define shortcuts that always add some argument, but that is definitely more cumbersome.

Config files are good for always/default settings, but are too rigid. Changing a config file is the equivalent of changing a global variable in code; it has a system/user-wide effect. Whereas I can just change the environment in this one shell and it will affect just the commands executed from that shell. Also, config files are much harder to manipulate from scripts, and use a different syntax for each tool.

Perhaps the ideal tool would allow every option to be set/changed from config file, environment and argument.


Why I don't like environment variables:

1. I worry about programs dumping all their environment variables to log files - credentials are now on disk, ingested into log storage...

2. Environment variables are inherited by child processes by default. This is undoubtedly useful. But it can also cause problems.

I wish the ghosts of unix past had foreseen the need for a way to mark particular variables as, say, 'sensitive' and 'noexport', allowing them to opt out of the default behaviour.

It would have been so easy to say "variables starting with _ are not inherited and should be censored when output", but we're about 40 years too late for that to catch on...


I used to agree with (2), but now I think: meh, it's an implementation detail whether the program uses my environment variable 'directly' or via a child process; it's not meaningful to make that distinction.

Where it is meaningful (and this is supported today) is setting them just for specific programs/invocations, rather than exporting them for a long-running interactive shell (and everything within it) willy-nilly.

More innovation around making that easier would be interesting: the env vars that should be set specified per program, isolated from others, for example. So `foobar` would actually get executed like `FOO_SECRET=hunter2 foobar` without specifying it every time or having it exported in the shell, and in a generic way that isn't specific to each program's config.
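
A toy sketch of what such a wrapper could look like in Python (the mapping file, its location, and all the names are invented for illustration):

    import json
    import os
    import subprocess
    import sys
    
    # hypothetical per-program mapping, e.g. {"foobar": {"FOO_SECRET": "hunter2"}}
    with open(os.path.expanduser("~/.config/envwrap.json")) as f:
        mapping = json.load(f)
    
    prog = sys.argv[1]
    env = dict(os.environ)              # start from the normal environment...
    env.update(mapping.get(prog, {}))   # ...and add the vars scoped to this program
    sys.exit(subprocess.call([prog, *sys.argv[2:]], env=env))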

It's not really related but for some reason 'summon' is on my mind as a tool to mention. I haven't used it in anger yet, but it is interesting. It's not quite this though, or at least, it solves only the 'storage' part of the implementation of what I described, not the 'orchestration' or mapping of programs to vars/summon invocations.


This is pretty much how systemd works. You can specify secrets that are retrieved from somewhere else and provided to the process in the environment it is started with. So you could do exactly this with the right unit configurations.


Ha, funnily enough I mentioned systemd and then deleted it. I do run as much like that as possible, I just couldn't succinctly explain why I thought it was different or better than putting:

    VAR=whatever process
in .xinitrc or wherever.


Iiuc you are advocating for setting env vars at the call site, like `FOO_SECRET=hunter2 foobar`? In that case, why not just use command-line args and call it like `foobar --secret=hunter2`?


I wasn't advocating for it in preference to args, but there are circumstances where that's not possible, for example calling some CI/CD tool (say terraform, ansible, fabric, whatever) that doesn't consume the var itself but uses something that does.

It's also a more convenient/already generic interface for doing something consistent across multiple programs.


Because in the latter case the commandline of the executed process (which may be exposed in various places, including a simple process list) is `foobar --secret=hunter2` and in the former case it's just `foobar`.


My first thought on the headline are specialized concerns of the above: environment variables are an attack surface. If you use them for configuration, it's all too easy for an attacker to modify them without the victim knowing. Just look at issues with LD_PRELOAD: https://attack.mitre.org/techniques/T1574/006/

That said, I agree with GP that environment variables are super useful and super simple. But I've also been burned more than a couple of times by setting something in the past and then having it cause unexpected bugs that are hard to trace down as they aren't in my working memory. They're a double-edged sword, to be sure.


There's actually a long list of variables that are unset when invoking sudo to prevent these kinds of attacks. Systemd will also start programs with a very minimal environment that isn't inherited from any shells. You then have to specify environment variables explicitly as part of the unit file. You can also specify environment variables in environment files.


1. If your program is chatty, it can be chatty in the same way regardless of where the improperly logged secrets come from; it's still your fault for being coarse and lazy. There's little difference between logging all environment variables (and/or all command line parameters) and logging the whole configuration object.

2. If your child processes shouldn't inherit environment variables, set them properly. The "ghosts of Unix past" have "foreseen the need" for execve(2) and execveat(2), which don't pass anything by "default".
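
Higher-level languages expose the same thing; for instance, in Python the environment handed to a child can be specified explicitly (a minimal sketch):

    import subprocess
    
    # the child sees only the variables passed here, nothing inherited
    subprocess.run(["/usr/bin/env"], env={"ONLY_THIS": "is visible"})
    
    # os.execve(path, argv, envp) works the same way: the last argument
    # is the child's entire environment block.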


> I wish the ghosts of unix past had foreseen the need for a way to mark particular variables as, say, 'sensitive' and 'noexport', allowing them to opt out of the default behaviour.

The default behavior is a non-exported variable. If you want child processes to see it, you must export it.


There is no such thing as an exported or non-exported environment variable. In fact, as far as the kernel is concerned, there is no such thing as an environment _variable_ at all, just a block of data.

See execve(2):

"envp is an array of pointers to strings, conventionally of the form key=value, which are passed as the environment of the new program. The envp array must be terminated by a NULL pointer."

You can confirm this by examining the environment block that was passed in to your current shell with:

    < /proc/$$/environ tr '\0' '\n'
What you're referring to are actually "shell parameters", some of which may be marked for export. When the shell starts up, it parses the environment block and sets parameters based on what it finds, marking them all for export. And the shell uses only the parameters marked for export when constructing the environment block for a child process (which is passed to execve(2)/execveat(2) in the envp argument).


I guess I assumed you were referring to environment variables in the context of the shell. Apparently I was incorrect.


Isn't storing credentials in environmental variables bad practice to begin with?


What better place is there to store credentials?


A config store like Vault. Of course, that needs credentials too, which are typically a file on the file system.

IMO, people are overly sensitive about environment vars. They are really no worse than files on the file system - both can be accessed if you're a privileged user on that machine.


Vault should be the source of those env variables, via some predefined init container or something like that, to which devs don't have access.


Or you could, you know, auth to vault and pull the creds from vault inside of your app?


You could, but then you’ll have replaced a universal and standardized abstraction with a hard commitment to one very specific approach. That doesn’t come cheap.


One thing that I like, which this approach allows for, is live configuration. For things like databases and such which allow for the regular rolling of credentials.

It's not simple by itself, but it simplifies other things.


How do you auth to vault?


Via either program config files, an inline subshell calling cat, or ssh-agent in that specific case, to keep credentials both out of the environment and off of the command line, where they can be read by inspecting the resulting process for its invocation.


All of those places can also be read.

SSH agent is a good example. It’s effectively an environment var which is why this works fine:

  sudo SSH_AUTH_SOCK=$SSH_AUTH_SOCK git clone ...
Edit:

The reason I think it’s silly to make a blanket statement environment vars are bad is because too many containers have credentials baked into the image when they should be passed in another way.


You can disable the ability to read the memory of other processes, and you can do the same for environment variables. Storing access tokens in memory is more obscure than environment variables though, that is true.


Store a path to the top secret file in an environment variable, and have the program read the credentials out of the file. Put the file somewhere far away from the repo, on the deployed filesystem.
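
Something like this, as a rough sketch (SECRET_FILE and the path are invented names):

    import os
    
    # only the *location* of the secret lives in the environment;
    # the credential itself stays in a file outside the repo
    secret_path = os.environ["SECRET_FILE"]      # e.g. /run/secrets/db_password
    with open(secret_path) as f:
        db_password = f.read().strip()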


Yes, for those reasons


Compared to what?


depends on who you are talking to, since you run the risk of committing the creds to a git repo


How do command line arguments or config files solve either problem?


Command line arguments aren't inherited by child processes. Unfortunately they are visible to other users on the system, so they're no good for credentials.

Config files (or an abstraction of them such as reading config data from a socket), after parsing, result in some credentials sitting in the memory of the process that needs them. They aren't in the environment block, so a quick and dirty "dump all my environment variables to stdout" procedure won't risk exposing them. And for the same reason, a child process that does the same won't inherit them in order to expose them.

Note that I'm talking about preventing accidental exposure of credentials. Config files alone can't protect credentials from a malicious process that deliberately goes looking for them to leak them; for that, additional measures have to be taken... but environment variables aren't part of the solution!


It's also easier to encrypt a config file and provide the decrypt key externally.


So then the real secret is the key, which is provided "externally" ... how? Through command line parameters, some other config file, or environment variables? :P


Another encrypted config file. Obviously.


Encrypted files all the way down.


A file that’s protected appropriately or standard input.


> Unfortunately they are visible to other users on the system, so they're no good for credentials.

If you’re concerned about your own software logging credentials, command line arguments are negative in two regards:

They’re highly visible when the process is running; they’re often automatically logged.

> They aren't in the environment block, so a quick and dirty "dump all my environment variables to stdout" procedure won't risk exposing them.

Okay — but the usual way that happens is “dump my config object in a log”, which parsed configs don’t help with.

You also now have a config file: how is it stored? ...is it in the repo? ...what are the permissions? ...how do we deploy it?

Environment variables don’t persist in repos and are designed to be integrated with hosting tools, like secrets managers.

I’m not seeing how a config file beats Kubernetes injecting from the secret store, which is why we use environment variables: so our tools (secret stores) can configure the environment our software uses.


Good point about command line arguments being often automatically logged! So they're bad for both reasons :)

Now, if you're running in k8s then you can improve your setup by mounting your secret into your container, and have your code read the credentials from the file within the mount. This just looks like another kind of config file to me :)


Embedding secrets (which should be changeable and have limited access) into container images (which should be reproducible and are perhaps stored in accessible locations) sounds like not a good idea; IMHO you definitely need the capability to have the same container use different credentials so that, for example, you can run the same container in a development or testing environment as in production, but with different credentials.



I believe they meant mounting it via kubectl as a file, from the secret manager.

So runtime file injection.


> And for the same reason, a child process that does the same won't inherit them in order to expose them.

Wait, this is the crux of why you think it’s more secure — but actually I see the reverse problem:

Dropping environmental variables is standard security practice, but dropping file access permissions is not. Most child processes read from the same set of files as the parents.

How would having files rather than ENVs make my container more secure, where we’re concerned with developers making mistakes (passing ENV vs passing file permissions)?

Similarly, the only proposed benefit of your idea is we don’t have them around post reading — but that’s true if you initialize a config object from ENV and then pass it around as well. (You ignored my point about how mistakes via logging happen.)


I'm concerned with 'I have credentials in my environment block and just dumped the whole thing to a log file'. Avoiding storing the credentials in the environment block obviously avoids that.

To be fair, unsetting sensitive environment variables after consuming them would probably also avoid that eventuality. I can count on the fingers of no hands the number of times I've seen developers do that! :)
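
For the record, it is close to a one-liner, at least in Python (the variable name is made up):

    import os
    
    # read the credential once, then drop it from the process environment so a
    # later "dump everything" log line (or a child process) can no longer see it
    api_token = os.environ.pop("API_TOKEN")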

Some other part of my process (or a child process I might launch) deliberately hunting for credentials in order to leak them is a different problem with other solutions.

In between these two cases we have mistakes like "dump config object (containing credentials) to a log file". That, too, can happen and should be avoided, what more can I say?


> You also now have a config file: how is it stored?

Hum... Your environment variables must be stored somewhere too, so the server can be launched. You can store the files in the exact same place.


Sure — you throw them in the Kube secret manager.

But now you have multiple config files (smart) or your entire config outside the repo (not smart). This isn’t always the wrong approach — SSH keys get loaded this way, for instance.

ENV variables naturally provide a way to layer content from different sources in a way that files don’t, so if you have a relatively simple config from multiple providers (eg, getting AWS session token from the host plus your environment config from the launch ENVs) it’s easier to use the K-V store nature of ENV variables versus multiple files.

Again, multiple config files isn’t always wrong — but using that to store single strings instead of ENV variables is a code smell, for sure.


> Perhaps the ideal tool would allow every option to be set/changed from config file, environment and argument.

This is exactly what the most widely used golang configuration library does: https://github.com/spf13/viper


It's also what most of the enterprisey frameworks do. Spring will do this and I'm pretty sure ASP.NET has some form of it.


I made something similar for Python, with an animal theme too, ha:

https://pypi.org/project/tconf/


I was taught many moons ago that configuration, like ogres and onions, is best considered in layers:

1. default values: What will most users in most places find most useful/least infuriating?

2. configuration files (system-wide, then user): What will most users on this system want most of the time? What will this particular user want most of the time?

3. environment variables: How should this session (i.e., a potentially large series of related executions) be tailored?

4. command line options: What is most useful for this particular run?

I was also taught that:

- figuring out how to go from an option to the name of a corresponding environment variable to a line in a config file should be both straightforward and well documented; and

- sometimes you need a more complex configuration than is cleanly supportable through any other method than a file. In such a case, the location of that file can itself be passed through options and the environment.
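
To make that precedence concrete, a rough sketch in Python (the file paths and variable names are illustrative only):

    import argparse
    import json
    import os
    
    # 1. default values
    config = {"colour": "auto", "editor": "vi"}
    
    # 2. configuration files: system-wide first, then per-user
    for path in ("/etc/mytool.json", os.path.expanduser("~/.mytool.json")):
        if os.path.exists(path):
            with open(path) as f:
                config.update(json.load(f))
    
    # 3. environment variables tailor this session
    if "MYTOOL_EDITOR" in os.environ:
        config["editor"] = os.environ["MYTOOL_EDITOR"]
    
    # 4. command line options win for this particular run
    parser = argparse.ArgumentParser()
    parser.add_argument("--editor")
    args = parser.parse_args()
    if args.editor is not None:
        config["editor"] = args.editor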


This is precisely how I set up my utilities. I have found the following practices useful:

1. Print the path of the system and user level configurations that the utility honours in the help text (-h/-?)

2. For an option that can be set interactively or via environment variable, specify the environment variable name in the help text itself to provide maximum choice to the user.

3. Provide a -viewconfig option that prints out the final resolved configuration state so that the user can see the actual configuration that is in effect. Combined with a -dryrun option, this can provide a lot of confidence to the user to try out things without breaking anything.


My guess for this is that some people have not had the good fortune of seeing software that followed this pattern and how nice it is.

I thought it was common knowledge that if you really wanted to do configuration right on a given project, you do all 4 (with some library support) and you write your code to gracefully handle the right piece of configuration from the appropriate "override level" (again, usually with the support of a good library).

See also: Domain Driven Design[0] which (if you ignore the consultant-fodder and jargon that comes with it) is probably one of the best written guides of how you should abstract systems, just like the gang of four book is a good introduction to structures in program/algorithm implementation you're likely to see in real life.

[0]: https://en.wikipedia.org/wiki/Domain-driven_design


Yeah, the article is just confused:

> Envvars have some legitimate usages (such as enabling debug logging) but they should never, ever be used for configuring core functionality of programs.

As though logging weren't core functionality!

The actual thing that is bad is grabbing an environment variable in the middle of your program. You should grab all the configuration in one place and use it to configure local state that is transparently passed around. Furthermore, flags, env vars, and config files are all just maps from strings to configuration, so you should use some system that can transparently layer them on top of one another. All of my new CLIs use flags first and fall back to ENV vars if the flag wasn't set.
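
That fallback is cheap to wire up; for example, in Python with argparse (the flag and variable names are invented):

    import argparse
    import os
    
    parser = argparse.ArgumentParser()
    # if --listen-port isn't given, fall back to $LISTEN_PORT, then to 8080
    parser.add_argument("--listen-port", type=int,
                        default=int(os.environ.get("LISTEN_PORT", "8080")))
    args = parser.parse_args()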


I like layered config as well. It really should be the default way of thinking about config. Our custom application framework handles that exact sort of layering and it's wonderful. I give it an annotated class representing the config that I need and it handles populating the fields from the config and generating the help message if something is missing.


This echoes nicely the traditional wisdom, also described here: http://www.catb.org/~esr/writings/taoup/html/ch10s02.html


(for default values, make sure you consider safety too! for example a debug option that might show PII is likely to be most useful to most people using the program, but shouldn't default to on because if it were on in prod the consequences would be serious)


yeah, i don't get too worked up about "how" config values enter the application as long as i can easily see "where" they are initialized/validated.

an immutable config object/class created on startup that reads files/env vars/whatever and has appropriate assertions to ensure good values were used and crashes the app for missing/bad values usually keeps things sane.

an app where each subcomponent has its own config that it gets in its own way usually leads to confusion and delay


I have a pet peeve about this attitude. These methods have been used for decades and are well understood, with all their advantages and disadvantages.

One day, someone comes along and tells us that it's bad and considered harmful, and happily tells us the only right way to do it. A flame war ensues.

I'm all for moving things forward and evolution, but can't we take a milder stance and move forward in a more peaceful way? Attacking something so well established because of personal reasons feels so wrong from my PoV.

That thing wouldn't be a de-facto standard if it was too bad, right? I think we shouldn't play with the foundation that much.


> That thing wouldn't be a de-facto standard if it was too bad, right?

I disagree. Very often, the 'easiest' option wins, not the one that necessarily is the 'best', especially in the long term. Environment variables for configurations are fast and easy, and work well on the happy path - but break down quite fast when dealing with more complex cases (e.g. more complex types than strings). They have a treacherous way of seeming the simplest and most pragmatic solution at first, but becoming an untyped, underdocumented hairball after some time in more complex software.

Furthermore, as long as environments have existed in UNIX and UNIX-like derivatives, their usage to configure the bulk of the behaviour of most services/programs is relatively new. The more old-school the service you deploy somewhere, the more likely it has a file-driven config. Indeed, sometimes it seems like 90% of the Docker code out there is converting environment variables into configuration files.


> Indeed, sometimes it seems like 90% of the Docker code out there is converting environment variables into configuration files.

This is a consequence of Docker's choice of the "image" as an abstraction layer. It's not trivial to say "run this image but with this config file added" (yes you could bind mount one in, or create a new derived image, but those are both harder and come with more pitfalls).

In most common docker usage, there are exactly two ways to influence the operation of the program contained within the image: Environment variables, and command line arguments.


For automation you will store them in a file anyway, but then how is it different from a bind mount?


More usually, in k8s configmap.


> I disagree. Very often, the 'easiest' option wins, not the one that necessarily is the 'best', especially in the long term.

Thank you for your disagreement and discussion, honestly. Actually, I think using environment variables is a burden. It needs more documentation, more explicit warnings, a lot of handling, etc.

So, environment variables are not the easiest way out there, especially when almost any programming language has nice config file libraries out of the box. Instead, these variables are added as a convenience feature for some frequent scenarios where a tool needs to adapt itself to the environment it runs in just before starting, or needs to be run repeatedly with small, transient changes to the config.

> Furthermore, as long as environments have existed in UNIX and UNIX-like derivatives, their usage to configure the bulk of the behaviour of a service/program is relatively new.

This is not what I see in my career. Bulk of the applications we installed and ran used some forms of environment variables for runtime configuration of the tool/application.

The reason for that is that the variable had a great deal of effect on the behavior of the program (which was generally scientific), making multiple runs without modifying a file very effective. You need these runs to conduct research, BTW, and you're on a cluster and jobs run long and whatnot.

TBH, most of these applications also had configuration files or "sensible defaults", and they created their default files if there was none. And if there was a file, the environment variable acted as an override.

So I had experimental software, fixed most of the parameters in the file and tried some other things by overriding some parameter(s) with an environment variable. Nothing was abused or misused.

> Indeed, sometimes it seems like 90% of the Docker code out there is converting environment variables into configuration files.

I've never seen it TBH, and if that's not documented well, it would be a big bag of fun for the users of that code.


> This is not what I see in my career. Bulk of the applications we installed and ran used some forms of environment variables for runtime configuration of the tool/application.

I think we might have different backgrounds and considerations as to what counts as 'oldschool'? Maybe I shouldn't have extrapolated this to pre-2000... So, my experience comes from working with the following 'mood' of services:

  Postfix, Exim, qmail, slapd, PostgreSQL, MySQL, FreeRADIUS, Apache, Nginx, ...
All of which have their own config file/files, format, etc. All of these system-wide services, not user applications. And I think that's the main difference? I tend to deal with software that is deployed in isolated environments, be it by root users on production server, or by whoever in a containerized environment. And not deployed on an interactive systems, to then be started/reconfigured by users running on the same system.

> I've never seen it TBH, and if that's not documented well, it would be a big bag of fun for the users of that code.

Check out the list above on Dockerhub. I'm not sure all of them are dockerized in this way, but at least a handful of them are.


I've further clarified my PoV here [0], but it won't hurt to reiterate. I'd be happy in fact.

> I think we might have different backgrounds and considerations as to what counts as 'oldschool'?

Most probably. All of the software you mentioned (maybe except Exim4) is actively used in our environments; quite a few of them are in very vanilla configs, and some of them are customized to the point of abuse. However, it's worth mentioning that all of the software you mentioned is in a support role in our scenario; it's the so-called side dish which we configure once and leave alone for a very long time.

> Maybe I shouldn't have extrapolated this to pre-2000...

I've started with a C64, please. :)

> All of these system-wide services, not user applications. And I think that's the main difference?

Yes, the tools I've talked about are userspace programs, and are not daemons 99.999% of the time. So you need to run them many times with small differences, and reconfiguring/regenerating a file is a lot of work, but as I said, they all have configs and env variables are convenience overrides most of the time.

> Check out the list above on Dockerhub.

Will take a look, thanks. Wanted to learn docker in depth for a long time, but had no notable project to force me to use it. Maybe someday.

[0]: https://news.ycombinator.com/item?id=26660409


> I've started with a C64, please. :)

Personal experience in computing is not what I meant. I only realized that I'm not intimately familiar with the dawn of the UNIX daemon and how their configuration methods changed with time, only the echoes of this in daily Linux use. Thus, I realized I was possibly extrapolating and assuming things.

> Yes, the tools I've talked about are userspace programs, and are not daemons 99.999% of the time. So you need to run them many times with small differences, and reconfiguring/regenerating a file is a lot of work.

Yeah, and I think this lack of distinction is what poisons the discussion surrounding this post - these are separate worlds with different requirements, conflated into a single argument or point of view.

tl;dr I stand by my point with preferring anything over environment variables for services (especially complex ones), but I also fully agree with your usecase for interactive, CLI-driven systems. I mean, one of my favourite programming language features in recent years is that I can cross-compile Go programs just by setting two env vars: GOARCH and GOOS :).


I'd be absolutely horrified if a service that I use needed a specific environment variable set in a particular way to work correctly and it wasn't well documented. That service would get bonus points for the inability to configure that particular option in a configuration file.

I personally would never add environment variables to a program I write which may run as a service.

Oh, Java is calling, hold on... :)


Sure, but everyone understands that option and knows the pitfalls. This is kind of the core of the “worse is better” mindset, which somehow seems to have disappeared from the collective consciousness even though Unix is bigger than ever.


> Attacking something so well established because of personal reasons feels so wrong from my PoV.

That is a mischaracterization of the post. The author is making a clear technical point about how environmental variables are global mutable state. Labeling that as an "attack because of personal reasons" is just plain misleading.

> That thing wouldn't be a de-facto standard if it was too bad, right?

How much of the post did you read? Your point is almost exactly the same as the 3rd listed in the post:

> It's the same old trifecta of why things are bad and broken:

> 1. Envvars are easy to add

> 2. There are existing processes that only work via envvars

> 3. "This is the way we have always done it so it must be correct!"


> That is a mischaracterization of the post.

I don't think so. First of all, as I detailed in [0] and [1], my central point of disagreement is the tone and attitude of the post, not the usage of environment variables itself.

There are a lot of scenarios where environment variables make a lot of sense, and scenarios where using them is absolute madness, as we discussed with q3k in [1].

> How much of the post did you read?

All of it. BTW, please remember asking this question is directly against guidelines [2] (sec: In comments, guideline 8).

> Your point is almost exactly the same as the 3rd listed in the post: 3. "This is the way we have always done it so it must be correct!"

As I said in my other comments, I do not directly support the exact opposite of the author's stance. My disagreement is in the tone and rigidity of viewpoint. To quote myself:

I'm not calling this good with the persistence of the original author. I'm saying it's one of the realities that we have, and instead of burning it with torches, why not build better conventions around it with a better attitude and language?

Please see [0] and [1] for further clarification.

[0]: https://news.ycombinator.com/item?id=26660409

[1]: https://news.ycombinator.com/item?id=26660553

[2]: https://news.ycombinator.com/newsguidelines.html


> These methods have been used for decades and are well understood, with all their advantages and disadvantages.

Just because something has been used for decades does not make it good - for example, avoidable mutable state. And it certainly does not make it well understood - as a consultant, the number of brain-frying environment variable configurations I've had to deal with which no permies could tell me anything about defies belief.

> someone comes along and tells us that it's bad and considered harmful,

Yes, some things are bad and are actively harmful. Famously, unstructured programming using gotos. Would you like to go back to that? Believe me, you would not. But perhaps you are not a programmer?

> That thing wouldn't be a de-facto standard if it was too bad, right?

It's not a "de-facto standard", it's simply bad.


> Just because something has been used for decades does not make it good.

I'm not calling this good with the persistence of the original author. I'm saying it's one of the realities that we have, and instead of burning it with torches, why not build better conventions around it with a better attitude and language?

Maybe we can try: "Instead of burying all config under environment variables, why not try doing it like this?", and slowly build something better, step by step. Nothing is inherently good or bad, but anything can be abused. So the abuse of environment variables as a shortcut needs to stop, one may say, and I'd agree, and might also volunteer to help build a better thing.

But, shunning it with anger and shouting "I'm the one who knows all right things!" sure creates backlash, like here.

All in all, I'm against the attitude, not the idea of improving a situation.

> Yes, some things are bad and are actively harmful.

It might be, but even your solution might not be right. Why the attitude?

> unstructured programming using gotos. Would you like to go back to that?

Did that on some older, limited hardware, and it was fun. It was not OK by today's standards, but I had to. I'll do it again if it's the only thing I can do to work on that particular hardware again.

> But perhaps you are not a programmer?

I just design algorithms and develop scientific applications which run on HPC clusters, nothing fancy.

> It's not a "de-facto standard", it's simply bad.

I didn't say it's good. I say it's a fact. I'm not disagreeing on its bad sides. I'm not OK with the attitude.


> "Instead of burying all config under environment variables, why not try doing it like this?", and slowly build something better, step by step

Use named files that can be version controlled and documented, but are directly readable by the actual program and can be reported as an error if (for example) they are not found.


Actually, this is how I do it:

    1. Make the thing completely configurable with a file.
    2. Add an optional switch to select the location of the config file.
    3. Always ship a well commented, sensible config file with the application. Document environment variables there, if any.
    4. If it makes sense, add the ability to generate a default config file if it's missing.
    5. Always add a logger, with good debug and error output (Dovecot is my inspiration and role model there).
    6. Document everything in the code, and in the documentation if possible (external documentation is not my strong part yet. I can't write it fast enough).


> But, shunning it with anger and shouting "I'm the one who knows all right things!" sure creates backlash, like here.

I think this is relative. I did read the post as well, and I didn't find the author to have such a "negative" attitude (but then again, I'm not American).


Agree. Also I don't see any proposal put forth. Did I miss it? The gist I got was "env vars are bad and if we don't use something else (what?) we're hung up in the past".


Easy fix (if it's the shared, mutable state which bugs you):

* Create one class responsible for ingesting env vars at startup.

* Call it from main, and abort early with nice messages if it fails to read something.

* Now you have a nice (preferably immutable) class which guarantees the config is in a 'good state', and is self-documenting because it lists all the keys it uses to lookup env vars with.


This is essentially what I do in Rails apps. The only reference to an env var is in an initializer that sets an option in the global rails config structure.


The mutable state can be helpful. It is sometimes helpful to be able to change an app’s config without having to restart it. Ingesting the envs on startup into a class removes this ability.


Please, never do that on a server.

It's ok for interactive applications, but if you are writing a CLI command (which is different from an interactive CLI application), a system library or a daemon, don't ever let the same application that uses a configuration also change it.

When your non-interactive programs do that and anything at all goes wrong, it's basically impossible to determine the source of the problem. Also, it is common that bugs that one could just avoid triggering by configuration now become unavoidable.

(But if you mean reload the config after getting a SIGHUP or something like that, yeah, this is ok, and the best way to do that is by restarting everything on your program, even if you keep the same process, so your read-once class won't be a problem.)


That's quite an antipattern in production.

Immutable config via a config class that can exit early (preferably at startup) if there is a misconfiguration.


The same pattern works well in Python at the module level, if your application is set up as a package. A module config.py sets a bunch of Python variables like

    import os
    
    ENV_VAR = os.environ.get('ENV_VAR', 'default-value')
then the rest of the application can grab configuration with

    from .config import ENV_VAR
Since the assignment code executes on import, all config is read in when any piece of it is first used, consistency checks and logging can be written into the config.py module as normal python statements, config values can be cast to appropriate types (raising exceptions if they fail), etc.


In node.js there are quite a few packages that do exactly this - it's a great pattern and forces you to define which env values you will rely on in one place, rather than dripping them all over your codebase.


This is more or less the way I handle configuration in most applications that I create. +1


Agreed - this solution works well, and works nicely with statically typed languages.


Oh this guy again. This guy created Meson and has a pattern of being 1) Quite toxic and 2) Entirely dogmatic when it comes to software design. He shows no interest in discussing design problems with Meson and asserts his viewpoints as truth and fact, resorting to snide comebacks instead of having thoughtful conversation.

Doesn't surprise me he wrote an article like this. Completely misguided and isn't rooted in reality.


Okay, so when you write that this blog author, who made a post arguing how environmental variables are global mutable state, is quite toxic and resorting to snide comebacks instead of having thoughtful conversation, then that is just you engaging in thoughtful conversation about the issue (which is envvars), and not you making a toxic ad-hominem attack at all, right?


GP is just one voice in this discussion, where others have already addressed the substance of the article. Some context and history is valuable.


I would say at least the comments about dogma and tone are relevant in the current context.


Yep.


It surprises me that people put more attention to the author than to the content. I have no idea who the author is, but his post doesn't seem to me toxic at all nor dogmatic.

On the other hand, your comment sounds a bit toxic, to be honest: "because the author is X it must be that all of his articles are X as well".


This post seems to ramble without much substance.

The best argument, perhaps only valid argument, is lack of an array type. Easy to work around. The rest seems misguided or ridiculous.

>There is no way to know which one of these is the correct form

What? Of course there is.

The author also calls it mutable global state, and seems to reference an application becoming confused when ENV isn't set. This reads to me, perhaps incorrectly, as though the author doesn't understand that env vars aren't, and don't behave like, shared global variables. That is, changing one won't affect running applications.


To be fair, environment variables are mutable global state across a single shell session.


How so? A running process won't affect your current env, and changing your current env doesn't affect the running process.


I don't understand how a blog post like this can garner, at the time of writing, a hundred upvotes. Normally these types of strong statements "Don't do X/Don't use Y" are attention-seeking titles whose content is inversely proportional in interestingness. That's the first red flag. The second red flag is that the article is not proof-read: from just the first few minutes of skimming, I found two errors that make the text jarring to read (Persistance->Persistence & [you] can not now -> [you] can not know).

Thirdly, the entire central point makes no sense. The author presents this argument to illustrate why environment variables are confusing:

> The environment is now different. What should the program do? Use the old configuration that had the env var set or the new one where it is not set? Error out? Try to silently merge the different options into one? Something else?

The answer is obvious to anyone that knows what environment variables are and how they work: if the variable is set, it should use it, and if it's not, it shouldn't.

The author goes on (again, spelling error) to state that:

> For comparison using JSON configuration files this entire class os (sic) problems would not exist.

What is the practical difference between using a JSON formatted text file containing settings, and a text file containing environment variables and their definitions? For the situation we're discussing here, the answer is frustratingly simple: there is none. This post is a waste of time.


Yes. At the same time, the post missed an appropriate case for avoiding environment variables: when you want dynamic configuration for your services, where configuration values can be changed at runtime without requiring a full redeploy of all the instances.

Of course, that brings its own set of complexity which should be carefully weighed against requirements.

Environment variables are still one of the simplest, and yes most deterministic, ways to alter the behavior of a program.


Such a shame. Environment variables are indeed difficult to work with sometimes: who sets them? In which file? Who can/will override them? Typos also aren't caught, because editors don't have lists of possible variables and/or what values they are allowed to contain. You might also lose them on different containers.

Environment variables should be part of cgroups in some way. I don't like that any program can modify the PATH variable, as an example. Seems like a recipe for disaster in privilege escalation.


> who sets them? in which file? Who can/will override them?

The ops team. Environment variables are a great way of separating operational concerns from business logic. Environment variables are great because your application is agnostic about how the configuration is sourced. Let the ops/infra team handle that.

> I dont like that any program can modify the PATH variable as example.

And ... why not? Child processes can't modify the environment of parent processes. Environment variables flow downwards.
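
A quick way to convince yourself of that, sketched in Python (DEMO_VAR is just a made-up name):

    import os
    import subprocess

    os.environ['DEMO_VAR'] = 'from-parent'

    # The child inherits a copy of DEMO_VAR and mutates only its own copy.
    subprocess.run(['python3', '-c',
                    "import os; os.environ['DEMO_VAR'] = 'from-child'; print(os.environ['DEMO_VAR'])"])

    print(os.environ['DEMO_VAR'])  # still 'from-parent'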


Author doesn't know about https://12factor.net and as another commenter mentioned, probably hasn't deployed something to a 'production' environment (or rather, doesn't know about separation of such environments in the first place).


To be fair, "configuration" means a somewhat different thing when we're talking about a user application vs a server. It's easy to forget on HN that some software engineers don't write web servers at all.

Having used the profiling tool TAU (subtle dig), I instantly understood what the author is driving at, and I somewhat agree for many use cases. I shouldn't have to fill my dotfiles with 10 new variables just to use a utility.


I don't know about https://12factor.net either.

But I was assigned a task last year to remove configuration from environment variables (for security reasons). I deployed my work to 'production'.


Linux is usually configured to not allow processes from another user to read /proc/$pid/environ. At least a production machine should be.

Configuration files are resistant to this as you note, but command-line arguments are not (--password=1234 will show up in ps for everyone).


What security concerns did that alleviate?


Any third party code in our system can just read whatever's in the environment and POST it to some remote server.


Avoiding environment variables reduces the risk but doesn't eliminate it. The secrets still live in memory in some form, correct? However, it does help to eliminate generic attempts to exfiltrate environment variables.

Tight control of egress network traffic is better but more difficult to implement.


Any third party code can just read your credentials file and POST it to remote server.


Bold of you to assume my third party code runs with the same UID and SELinux label as my credentials-handling code.

(I wish, it's April 1 after all!)


If the third party code runs with a different UID, then it can't read the environment either.


Unless it has DAC override or other capabilities. Belt and braces!


If it has DAC override, then it can read your credentials file just as easily as it can the environment.


Not if SELinux policy prevents it.


File permissions allow finer granularity of access control. Environment variables are visible to any user in the system.


Not in any multi-user multi-process OS. You set environment variables in a process (i.e. shell/CMD.EXE) and spawn a child process (the program) from that parent. The environment variables will only be visible to those two processes.


Linux disagrees; try

    strings /proc/*/environ
to see for yourself.

On Solaris/SunOS, you could use `pargs -e $PID`. And so on.

Having separate UIDs to run your processes A and B under shields either one from peeking at the other's environment, though. UNIX DAC is simple and powerful enough for MOST security concerns, I would argue.


> Environment variables are visible to any user in the system.

This is completely false in any modern OS. You can only see environment variables of your own processes.


Unset them right after evaluation.


That's not where the credentials are stored.


Well, sure, you shouldn't be putting secrets or other sensitive data in environment variables. But garden-variety configuration is fine to put in env vars. Seems like whoever assigned you this task didn't really know what they were doing.


Oops, I've been putting secrets in environment variables since I can remember. Your comment piqued my curiosity about why this is a bad idea.

Found this:

https://diogomonica.com/2017/03/27/why-you-shouldnt-use-env-...

https://security.stackexchange.com/questions/197784/is-it-un...


It looks like the author is talking about command line tools that use env vars for things that should be arguments. In the comments on the page he admits that for example key credentials are valid usages for env vars.


Why the trolling? He has valid points. It doesn't matter whether he knows about this methodology or not. Does everything look like a nail to you?


> Author [...] probably hasn't deployed something to a 'production' environment

I think you're wrong about that. The blog post author is also author of Meson, the build system.


If you don't mind, I'll keep doing the relatively sane thing: using env variables at the startup of my applications (and, as much as possible, never anywhere else), among other configuration sources (like text files), to create a struct / object / dict / whatever that represents the configuration, and that the rest of my code uses.

If you see someone using `os.environ["xxx"]` as an escape hatch for a mutable global variable, then, yes, it's probably not a good idea. But it's not configuration anymore, it's runtime state.

(Although I suppose no one is going to hit HN front page by writing an article titled "Don't use global mutable state", except to make game programmers giggle ?)


"Environment variables is exactly this: mutable global state." No it isn't. Every time you start a process, it gets a set of environment variables of its own, which won't be changed by any further changes in the parent process. This is the opposite of how global variables work and is exactly how function arguments work.

The rest of the article isn't very good either. The examples of running a program with two different states and of trying to do nested escaping would both apply to any means of passing configuration.

I also find it hilarious that the author suggests using JSON configuration files instead, which actually have many of the problems that this article falsely claims that environment variables have.


> which won't be changed by any further changes in the parent process.

You can attach to a process and call setenv in gdb, so there is a loophole somewhere.

I'm not encouraging this, just pointing out that having a nice binary which reads config, cli, and environment once is still required. Regardless of how you pass that information into your binary.


The title should be "don't sprinkle environment variable reads all over your codebase", not "don't use environment variables".

The problems proposed are easily fixed by:

1. read all your env vars at startup in a single function, then pass them down from there (a minimal sketch follows after this list)

2. don't invent your own serialization format, just use json or csv, both work fine in env vars. or use the env var to reference a file path that contains more complex values.
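
A minimal sketch of point 1, assuming a Python service (the variable names are just illustrative):

    import os
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Config:
        db_url: str
        workers: int

    def load_config() -> Config:
        # The only place in the codebase that touches os.environ.
        return Config(
            db_url=os.environ['DB_URL'],                  # required: KeyError if missing
            workers=int(os.environ.get('WORKERS', '4')),  # optional, with a default
        )

    def main():
        config = load_config()
        # ...pass config down explicitly from here on...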

As a devops engineer / sysadmin for going on 10 years now, I pretty strongly disagree with this article. Environment variables are so much better than the alternative.

In the past, programs frequently invented their own configuration loading systems, but over the last few years containerization has strongly nudged most programs towards accepting env vars. The result has been a better, more consistent, and less surprising config experience for everyone, even non-container users.


Mixed feelings about this. I strongly agree with some, but also feel it's missing the point in a lot of places.

Environment vars should of course not be used to configure specific programs. Having an environment variable to specify the args with which to call a program is needless complication. Just pass it as an argument. Use a configuration file to configure the program.

Environment variables should (only) be used to describe the environment. That should primarily be variables that transcend individual programs. Things like proxy settings are perfect for environment variables. (Well, almost; see below.) Perhaps the location of the configuration file, if that can vary per system (which it probably should be able to; hard-coded locations can also be a problem).

But even then, environment variables can fail. I noticed that some Azure/Kubernetes-related commands on my work Macbook need to run with the proxy on, and others with the proxy off, so I created aliases to enable/disable this environment variable, which completely defeats the purpose of the environment variable. Maybe I should be able to configure this proxy per application after all. Or at least configure whether to use or ignore the proxy settings. And then there are applications that ignore the proxy env var for whatever reason, and require me to configure it specifically for that one program, again defeating the purpose of environment variables (I think npm does this).

But when we deploy our application to different environments, our deployment configuration does set specific env vars so the application knows how to behave in that environment. It's what environment variables are for. But they're not a great fit. For example, we're currently in the process of migrating from AWS to Azure, and some things need to be enabled or disabled there. So we set some environment variable to 'false', except that environment variables are always strings, and in JavaScript, 'false' evaluates to true. JSON configuration might actually make more sense here.
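
For what it's worth, Python has the same trap, and the usual fix is to parse the accepted spellings explicitly; a rough sketch:

    import os

    os.environ['USE_AZURE'] = 'false'

    # Naive: any non-empty string is truthy, including 'false'.
    print(bool(os.environ['USE_AZURE']))   # True

    def env_flag(name, default=False):
        value = os.environ.get(name)
        if value is None:
            return default
        return value.strip().lower() in ('1', 'true', 'yes', 'on')

    print(env_flag('USE_AZURE'))           # False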

So I don't think we can or should do without environment variables, but I think I agree they're overused, and often used badly.


Yeah, we have a service that needs to use a proxy for most calls, but one component mustn't use the proxy, in a way that iirc is not properly captured by a `NO_PROXY` entry. Now the abstraction breaks down and the service has some ugly special-case code. :(


The old guru I learned Linux from always said: "The bugs I've spent the longest time tracing down have always been caused by environment variables. It was only when I started checking them first, rather than last, that I felt competent as a sysadmin."

Truth is, you aren't going to get rid of them unless you make an operating system that doesn't have them (and good luck porting anything useful to it). This is poor advice, because it amounts to FUD -- those problems won't go away, but telling people to avoid environment variables will quash their curiosity about them. This teaches people to only look to the environment last.


I've endlessly debated with myself about this and have come to the conclusion there isn't really a good solution; the best choice is probably to put as much in an actual file as possible. The environment is sort of leaky and non-obvious. At least with a file there is something written somewhere that can be inspected and passed as an argument or as an environment variable. The only real positive I see with the environment is that children of the process group basically get it for free, but that can be a negative in and of itself as well.

If I have to pick between environment and arguments as configuration, I'd probably prefer arguments since the application would have to explicitly iterate over all the arguments and handle them in some manner, like assign them to some structure or global internal to the program.


This flies in the face of pretty much every opinion I've heard from experienced developers in the past 5 years. Once someone said, "You should be using ENV for configuration" I started doing it, and I found it to be a better solution than I previously had. I am also open to dedicated config files, whether they set ENV vars or not.

I'm open to the idea that ENV is not the only way, and I certainly believe that there are situations where other solutions are warranted, but my opinion right now is that this is wrong, and I perceive this also to be the prevailing opinion in our industry.


Yes. If you package software up for deployment on some server (i.e. you use something like Docker), environment variables are the easiest/only configuration mechanism at your disposal. Packages that don't support this, need some workarounds (e.g. dockerize to template some config file using environment variables) to be packaged up; which is annoying and extra work. Decent server software comes prepackaged in docker form these days. Which means environment variables are the way you control those unless you want to force your users to create their own docker containers just so they can fiddle with config files, which is a bit user hostile.

Dedicated configuration files only make sense if you assume a writable file system is there. Which is a broken assumption on many containerized environments. There is a lot of legacy software that works that way of course. Some software allows doing configuration via config files and then allows overriding keys in those files with some naming convention via environment variables. That's a good compromise since that allows you to package up sane defaults that you override as needed via the environment. It does not have to be an either or type thing.

Another common pattern is to allow overriding configuration via commandline arguments, which you can then gather in an environment variable and inject via docker. I do this a lot with JVM based software where we have a lot of -D options to override specific configuration value defaults via docker. Less clean than just having dedicated environment variables in the docker file but it works.


> Dedicated configuration files only make sense if you assume a writable file system is there.

I disagree. It's pretty typical in containerized environments (in my experience) to pass in (eg. mount) a config generated by whatever configuration management system into a container for it to load its configuration from.

This has the following advantages over env var configs:

- support for more complex expressions than string -> string maps (eg., configuring an IP blocklist)

- less chance of mistakes stemming from typos (eg., 'FOO_LODGIR' instead of a 'FOO_LOGDIR' in an environment variable will likely be silently ignored by a service, while a 'lodgir' key in a config file will cause an error in most serious config parsers that I've seen)

- working against a schema - if you use something like openapi, json-schema or protobuf/prototext to define your config format, you can use this schema to check/generate the config from other code, and even use it as an automatic source of documentation for the configuration format

- hot reloads of configs - once started with env vars, the env vars cannot be (easily) changed, while files can easily be changed (on mutable FS, or from an external source like ConfigMaps/Secrets in k8s), watched and reloaded from, or even used as a signal that the software should restart


Writable file systems are needed for things that are stateful but inappropriate otherwise.


Docker certainly supports mounting a configuration file on the host system. Might be a little messier but it’s not impossible.


IMO, the author is writing from the perspective of Meson, a command line build tool used by individuals that takes arguments and caches them in a per-project file, whereas most of the negative replies are commenting from the perspective of sysadmins deploying software into homogeneous servers or Docker containers. Would make be a better program if MAKEFLAGS was not an environment variable? (IDK.) Would Git be a better program if the project directory was passed as an argument rather than as inherited state (the cwd)? (IDK.) Would less or nano be a better program if file paths were passed in as environment variables rather than arguments? (no.)


> Would Git be a better program if the project directory was passed as an argument rather than as inherited state (the cwd)? (IDK.)

Well, it can do both :) `--git-dir=<path>` will let you run git for another project directory


> a command line build tool used by individuals that takes arguments and caches them in a per-project file

That is literally one of the worst ideas I’ve ever heard of. And I’ve heard many bad ideas recently.


What's the alternative? That's how make, cmake, msbuild, ninja, docker, and basically every other build tool I'm aware of works.


CMake does that and is a truly demented implementation, mixing user-specified initial state (supplied the first time you run CMake and carried on to future argument-less invocations), derived state, and cached compiler locations and versions in a single CMakeCache.txt file. Edit CMakeLists.txt? Time to delete CMakeCache.txt! Upgrade your compiler? Time to delete CMakeCache.txt!

I haven't used Meson all that much, but I recall it's a bit better than CMake but I still ran into a similar issue at one point.


You take the arguments every time? Or you read them from a configuration file.

But you do not automatically add them to the configuration file unless the user explicitly tells you to do so.


Docker doesn't do that. You can create a .env file, but it only reads the options you give it, Docker never writes/caches to that file.


Make doesn't work like that.


Wait, make has a global state? I was under the impression most make targets are stored somewhere in the project itself (like a build/ directory). I think this is a fine approach, BTW. I think global state should be avoided unless necessary.


Make does what you tell it to do. While some projects have the setup that you've described, many others do not. Make will only check the timestamps of the generated assets and decide to build (or not) based on that.


I could agree with "don't only use environment variables for configuration", but at the same time, by all means do use environment variables to allow for easy overriding of configuration.

I cannot but think of the large number of times that being able to quickly override some parameter with an env var has helped me achieve something which was not exactly intended by the original author.

Also, env vars are the most common way to override configuration of software that has been dockerized (prepared to run in a Docker container). Plus, containers remember their initial environment, so there is no issue of running afterwards without the env var.


Changing a variable of an existing container without recreating or re-running the docker run command is tricky, though. With a configuration file mapped to an external folder it's "just" modifying the file and docker restart; with environment variables it's somewhat more convoluted. I don't think it is as clear-cut as the title suggests, each kind of configuration has its place.


Yes, that was my point actually, although I might have expressed it poorly:

The article says that one problem of using env vars is that you might set it for one run, but then forget to set it for the next one. This could be problematic if the program is stateful. So I wanted to make the passing comment, while talking about containers, that a container will "remember" the initially set env vars, so if you stop and start the process, the variables would persist across executions.


Indeed, that's actually an awesome feature because it guarantees the container will work as it was running upon a restart.

The problem is when you have an environment variable you want to change for some reason without triggering a redeployment/release/whatever. With a configuration file in a volume you can change it and restart (of course this breaks the premise of "restart keeps full state", but I can live with file modifications; with docker at least you can keep them "triggered externally"). With an environment variable the only option I have found is stopping docker, digging into the container configuration, changing said variable and starting docker again (lovingly called "Indy swap" when we've had to do it, luckily it's less than once a year).


You don't need to modify the Dockerfile to change environment variables at runtime.

Also if you specifically want to modify a file and re-run the container look at --env-file arg to docker run.


That's not what I'm saying, what you mention can be done indeed.

What I say is that if you are in a machine with an already running docker container, one you don't have the corresponding `run` command of (i.e. you _can't docker run_ for whatever reason), changing the environment variables of that container is tricky.


You could always run the `env` command inside of the container to get all of the set environment variables. Then you can define, reconstruct and run your container with whatever args you want.

But yeah the env_file approach is the way to go here. I've been using Docker in production since 2015 and never ran into your use case. I always had an .env file ready to go that was loaded in with env_file and always had control over being able to run the container with whatever command I see fit.

And in cases where I have no control over how things are run (like Heroku), environment variables still work because a ton of hosting platforms expect you to set and read them for various configuration.

And you can also commit an example env file to git with no secrets so developers can `cp .env.example .env` to get going in 1 second.

The environment variable pattern is incredibly standard in the world of web dev.


I have been using Docker in production since 2015 as well, and I've had to do this more than once. To each their own.


> Environment variables is exactly this: mutable global state. Envvars have some legitimate usages (such as enabling debug logging) but they should never, ever be used for configuring core functionality of programs

Doesn’t matter where configuration lives, the name itself suggests that it is by definition global mutable state, I mean that’s the whole point of it.

And it doesn’t matter if it’s used to configure core functionality (e.g feature toggles) or secondary functionality. The whole purpose is to do exactly that :)


Maybe I live under a rock, but I feel like command line utilities taking environment variable arguments just isn’t that common of a practice? The only examples I can think of are when an argument is more of a “global variable” that many different commands may find useful such as JAVA_HOME, GITHUB_TOKEN, etc. For web applications and docker containers, environment variables fit easily into the deployment process and for all intents and purposes, are pretty much static.


>Maybe I live under a rock, but I feel like command line utilities taking environment variable arguments just isn’t that common of a practice?

Nope, it's quite common. Many standard UNIX cli utilities take some environment variables (including things like LOCALE). Also common in scripts, setting up programming languages paths, etc.

It's also not a bad practice, contrary to what the author says.


LS_COLORS and GREP_COLORS are weirdly-specific... but arguably aren't!


Examples I've used in the past few days, from memory - along the lines of:

LC_ALL=C foo

PAGER=cat man ls

TZ=UTC date


These are good examples but I think these fit under my “global argument” umbrella. Imagine if every Linux command had a different argument name for the pager?


One perspective from this thread: environment variables end up denoting different levels of persistence in interactive, server, and containerized applications. For interactive apps, environment variable configs tend to sprout up for _persistent_ configurations e.g. HOMEBREW_NO_AUTO_UPDATE - we want the behavior to change, but we don't want to pass a flag to make that change every time.

For server-like applications, env vars denote a _transient_ change in behavior e.g. FLASK_DEBUG=1 python -m flask ... to turn on different behavior in that instance of the application. Persistent configuration changes go to a config file or similar.

For containerized applications, env vars are back to denoting _persistent_ changes in configuration since we bake the values in to deployed containers via whatever orchestrator.

TFA seems to be assailing the first perspective, which is actually reasonable. Sticking secrets and configuration into the shell environment for each tool is not great. Transient config via args and persistent config via files makes lots of sense.

Even in that setting though, there is probably a role for environment variables when spooky-action-at-a-distance changes are required in sub (sub- (sub- ...)) processes / libraries where passing configuration through each caller would be a pain.


I hope I don't get downvoted, but I think the OP has a point. Command line arguments will do, and perhaps even be better.

Why are there so many harsh comments about this post?


In typical Linux setups, cmdline is world-readable, but environ is not. So you never should put secrets in cmdline, but they are ok to be in environ. And that is pretty much the only difference between the two.


The environment is inherited by child processes. I think that is a very important difference.


Of the relevant Linux syscalls, neither fork nor clone touches argv or environ at all; on the other hand, execve requires passing both argv and envp explicitly.


Unfortunately many people here have a bad attitude, thinking they know everything. His point is completely valid. I myself have always disliked this relatively recent use of environment variables for configuration – it likely has the 12-factor methodology to blame for its acceptance.


Command line parameters should be used for things that change frequently, not for configuration variables. You don't want to pass 20 command line parameters every time you invoke a program. You rather want to set your environment variables one time in your shell initialization script (.bashrc, .zshrc, whatever).

You can say there are the configuration files, and sure a complex program should have its configuration file. The problem of configuration files is that they all use a different syntax that you have to learn, you have to write them, back them up, copy around, etc.

Also, you don't have a simple way of overriding a configuration file parameter for one invocation of the program without modifying the configuration file. With environment variables it's simple: set them before executing the program.

Last thing is that environment variables aren't specific to a particular program: they are readable by every program executed in that shell. A configuration file cannot be easily shared, since one program can change its format to make it no longer backwards compatible, so you can't rely on reading other programs' configuration files, while you can with environment variables.


I agree. I also prefer command line arguments (or even config files) instead of environment variables, in all three cases: when developing software that others run, running software that others have developed, and running software I've developed.

In addition to what's said in the post, environment variables have one more significant, practical shortcoming: if an application 'foo' takes a FOO_LOGDIR (which defaults to /var/log/foo if unset) and you accidentally slip it a FOO_LODGIR=/run/log/foo - it will silently accept and ignore it, while you will be scratching your head wondering why the logging is not behaving as expected, until you discover the typo.

Naturally, 'foo' could ensure no FOO_* with unknown names are set. But I've yet to see an application like this in practice, while nearly every single flag parsing library out there immediately errors out on unknown flags.
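
The check is cheap to write, though; a sketch in Python, assuming the application's variables all share a FOO_ prefix:

    import os
    import sys

    KNOWN = {'FOO_LOGDIR', 'FOO_PORT', 'FOO_DEBUG'}  # whatever the app actually reads

    unknown = {name for name in os.environ if name.startswith('FOO_')} - KNOWN
    if unknown:
        sys.exit(f'unrecognised FOO_* environment variables: {sorted(unknown)}')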


Env vars are a built-in for key-value command line args. Why introduce some custom, convoluted way into your app for kv args instead of this standardised approach?

I can understand why some heavily used cli app would support a custom parameter pattern, eg. for ergonomics, but for the majority of apps running as deployed services and not cli apps invoked 100 times a day, I don't see a reason not to use env args.

I can run the following and introduce exactly zero parsing boilerplate into my app or invoking scripts.

    port=12345 color=green ./myapp.py
Also - some apps have to be configured from file, others on the command line. Which parameter parsing library even supports that? Env vars support this out of the box.
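
For what it's worth, the myapp.py side of that invocation needs almost nothing (names taken from the example above, defaults made up):

    import os

    port = int(os.environ.get('port', '8080'))
    color = os.environ.get('color', 'blue')
    print(f'listening on {port}, painting things {color}')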


Environ isn't actually key-value; it's only convention that applications (/libraries) parse it in the name=value form. But the underlying mechanism is similar to cmdline: an array of pointers to strings. You could parse cmdline the same way if you wanted.


It's an exceedingly strong convention. I'd be genuinely curious to learn of non-malware uses for entries within the environment block that don't fit the standard name=value pattern.


Don't POSIX and portable C both require key=value? If so, isn't that more than just convention?


I don't know about POSIX, but C doesn't mandate an implementation. The C Standard says about getenv():

Quote:

The getenv function searches an environment list, provided by the host environment, for a string that matches the string pointed to by name. The set of environment names and the method for altering the environment list are implementation-defined.

The implementation shall behave as if no library function calls the getenv function.

Returns

The getenv function returns a pointer to a string associated with the matched list member. The string pointed to shall not be modified by the program, but may be overwritten by a subsequent call to the getenv function. If the specified name cannot be found, a null pointer is returned.

End quote. The takeaway for me is that yes, environments do define a key/value store of some sort, but how they're implemented isn't stated. It can be an array of strings in "key=value" format stored in RAM, but it could just as well be a hash table stored in ROM.


Also, the specification of getenv does not exclude having other data in the environment that is just not retrievable by getenv but would be accessible e.g. through the third argument to main that is mentioned in the common extensions section.


Righto, I was considering Linux only, I don't personally care much for POSIX. I don't think ISO C standard sets any strong requirements. But yes, if you want maximum portability then you probably should be pretty conservative with environment.


There are a lot of use cases. For example, I have 17 applications running on a system that need a connection string. When I need to change that connection string, I can 1) change 17 config files; 2) change 17 shortcuts containing command-line arguments; 3) change one env variable.


> I can 1) change 17 config files; 2) change 17 shortcuts containing command-line arguments; 3) change one env variable.

This seems like forcing the application to take the burden of your configuration system having shortcomings (like not being able to freely convert from a single source of truth into multiple generated files/command lines/...).


Or change ONE file that contains the connection string.


That's literally what they're for. Take for instance:

DATABASE_URL is a fail-safe method of configuring. When you install the app in dev, the app can _only_ point to the dev database. If the variable is absent, the application fails to start. If the prod URL is accidentally put in the dev environment, firewall rules would prevent the prod connection.

Configuration is _best_ put into the environment, not the application itself.


> For example suppose you run a command line program that has some sort of a persistent state.

`$ SOME_ENVVAR=... some_command <args>`

> Then some time after that you run it again:

`$ some_command <args>`

This is absolutely no different from the following:

> For example suppose you run a command line program that has some sort of a persistent state. `$ some_command --some-arg=value <args>`

> Then some time after that you run it again: `$ some_command <args>`


If not env var then what is a good alternative? Config files could be considered "mutable global state".

I have no issue with env vars for read-only configuration. Programs that set env vars... that's where it gets very ugly.


Pydantic has a nice way to parse/validate a collection of environment variables into native Python types using type hints [0]. It solves some of the pain points the author mentioned in a clean way.

[0] https://pydantic-docs.helpmanual.io/usage/settings/
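
A small sketch of what that looks like, assuming pydantic v1's BaseSettings (field names and defaults are made up):

    from pydantic import BaseSettings

    class Settings(BaseSettings):
        port: int = 8080
        debug: bool = False
        database_url: str   # required, no default

    # Reads PORT, DEBUG and DATABASE_URL from the environment (matching is
    # case-insensitive) and raises a ValidationError on missing or badly
    # typed values.
    settings = Settings()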


I've always used a mix of environment variables and a set of config files for my apps. Often the vars determine which config files are loaded, which mode the app operates in (expert,normal,beginner), and debug level. I never put security information in the env variables. I also always treat vars as unvalidated user input (i.e., apply sanitization and validation).


I think these disadvantages can be mitigated by putting the code that fetches environment variables close to the entry point and pushing the values down as arguments. I’d also say that a lot of these things are problems with configuration in general and one should seek to minimize unnecessary configuration to avoid a lot of the associated headaches.


This is such a bad take that it feels like it being in the frontpage might even be dangerous.


No, they aren't mutable global state, they are parameters, which is something many languages support: the key difference by design is that their configuration only lasts for the dynamic extent of the call.

An environment variable is not mutable global state, for if a process changes one, then that change is inherited by its children, but not by its parent or siblings, and the change effectively stops existing once the process ends. That's a very important design choice that removes all of the problems with mutable global state.

In that sense, they are more like thread local variables that are inherited by child threads.

A configuration file is much more akin to global mutable state of a system.


Clearly we shouldn't use environment variables for everything - I agree with the do-it-in-layers comments of others - but I was compelled to point out he's got it exactly backwards. He suggests that environment variables are mutable global state, when they're not, then suggests actually mutable global configuration files. Surely in his scheme about adding numbers this is equivalent to setting the numbers to add in an entirely different source file!

Environment variables, on the other hand, are more like lexical scoping - you can shadow them, spawn a new shell with copies of them, override them for a single invocation and then have them go back, etc.


I've recently started developing unix tools with config passed entirely as arguments with a special argument called `--args-file` which takes a file or stdin and reads one arg per line.

This is of course nothing new, but such a powerful pattern:

- It lets you run things without the need for a config file (when testing/running interactively)

- but on the other hand, you can still define arguments saved in a file for more permanent setups.

- `--help` ends up providing all the doc needed to configure the tool.

- It's less magical than env vars and doesn't leak into subprocesses by mistake.

One pragmatic step further is to expand env vars using the shell `${VAR}` syntax for slightly more flexibility.
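
Not the commenter's actual tool, but a rough sketch of the pattern in Python; the --args-file flag and the ${VAR} expansion are assumptions for illustration:

    import argparse
    import os
    import sys

    def splice_args_file(argv):
        # Replace '--args-file <path|->' with the file's contents, one argument
        # per line, expanding ${VAR} references from the environment.
        if '--args-file' not in argv:
            return argv
        i = argv.index('--args-file')
        source = sys.stdin if argv[i + 1] == '-' else open(argv[i + 1])
        extra = [os.path.expandvars(line.strip()) for line in source if line.strip()]
        return argv[:i] + extra + argv[i + 2:]

    parser = argparse.ArgumentParser()
    parser.add_argument('--port', type=int, default=8080)
    parser.add_argument('--log-dir', default='/tmp')
    args = parser.parse_args(splice_args_file(sys.argv[1:]))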


I wish there was wider adoption of configuration services. Something like the Envoy xDS protocol, but for general configurations.

Puppet & Co did it to some extent, but for the whole server. On boot, the server asks a centralized service for a configuration, providing just its identity and some basic attributes (datacenter, rack, stage), and receives its full configuration. Implementing this pattern for a service itself would make it easier to configure swarms of services.

I experimented with using Open Policy Agent, where it dispatches exact configuration based on client identity and quite liked the result. Only downside is that it requires polling.


It indeed seems like there are a lot of fundamental misconceptions in this article:

> For comparison using JSON configuration files this entire class os problems would not exist.

This is the trade-off between structure and embedding. Writing '{"ARG": "-Dfoo=\"bar bar\" -Dbaz"}' would satisfy the author's need for JSON but not really make any difference.

The initial example is also wrong. There certainly are examples where one would use variables defined in an outer scope.

  f v = f' 0
    where
      f' n = v n


Good point, not the best arguments. But somebody needs to say this.

Configuration in environment variables is suitable for short options for commands that need to be inherited by sub-commands. Great for things like LESS and http_proxy.

Suddenly people start shoving all kinds of configuration for a specific instance of a piece of software into environment variables. Often with an argument about how otherwise it's not "twelve factor". That's not great. I mean, it's nice and all, but it's quite literally a blog post from a guy on the Internet, not an argument in itself.

Arguments against putting all configuration for a software instance in environment variables are that it's not suitable for non-ASCII and multiline data for a multitude of practical reasons, the storage space is limited, the actual size will vary between operating systems and overflowing it will not be obvious, and the fact that child processes will inherit this data. If there are keys and other secrets involved, child processes will receive a copy of them.

In comparison, storing configuration in a file will have a much more well defined format, it can be written and copied just like any other piece of data, and the standard tools will control access to it. Things like AppArmor can limit access further to the single process.

Configuration files have been used forever for a reason; it's a reasonable default choice. Environment variables should be used where they are suitable: for interactive tools and for globally shared settings.


Never trust advice expressed in absolutes.


Well, infer that it's not absolute but rather ironically depends on the kind of environment you're developing software for. <:)


What I mean is that when advice is expressed in absolutes it tends to be a radical view of the problem.

Radicals tend to dismiss things that don't agree with their world view, and I am always wary of this. And you should be too. I don't mean radicals can't be right. What I mean is you should be cautious about it.


I feel like this is one of those riddles with two doors, two guards, one always tells the truth, the other always lies.


Environment variables, when used correctly, tend to work just about everywhere. That's why they're in use. It's the same reason why programs still install to `/usr/bin`, even though we're all very much aware that `/usr` is not "user directories" and the whole idea of "userspace" vs "kernel space" is irrelevant in 99% of *nix installs these days (including containers).


Configuration should be parseable. Unix philosophy has been to use configuration files for anything much more complex than flags, which seems reasonable to me. Don't try to force structured configuration into arrays of strings.

Traditionally, I see flags as being very application-specific and environment variables being very generic. Environment variables control the behavior of shared libraries and the interaction with the operating system. That separation frees applications from worrying about name collisions between application and library configuration.

The tradeoff Unix made between unstructured configuration (e.g. main(void *arg)) and fully structured configuration was probably wise. Arrays of strings with quoting rules are pretty flexible and human readable. It doesn't give the full flexibility of the extreme approaches but has aged well.

For example, anyone who is annoyed by environment variables is free to wrap every program they care about in a shell script that sets the environment from the flags they want. Anyone who hates flags can pass environment variables to their version of the script with a bunch of --flag=${MY_PRECIOUS} inside.


Okay, so, I don't think the author cares about the situation where your entire system is, like, 1) set env var 2) launch long-running network service 3) never spawn another process inheriting the environment. That's the trivial, happy case. This post is probably more useful if you look at it from the perspective of a system invoking many different processes that come and go, somewhat recursively, likely integrating many independently-developed codebases.

With non-trivial use of environment variables, you're just back to arguing about dynamically-scoped variables vs. lexically-scoped variables (comparing environment variables to lexically-scoped global variables seems beside the point). There are uses for dynamic scope and it keeps getting reinvented, but we know it comes with gotchas and typically prefer lexically-scoped variables.

eg., environment variables are handy because you can pass them to your grandchildren without the direct children needing to know about them. Environment variables are hazardous because you can inadvertently pass environment variables to your children without knowing you inherited them from your parent.

Environment variables can save you from having to teach your program to pass on the correct configuration. Environment variables can damn you when your children rely on them and hence omit facilities to propagate configuration in some nuanced manner. etc.

I'm taken aback by the rancor in our comments. It shouldn't come as a surprise that people developing different kinds of software arrive at different best practices. I'm enjoying this kind of post much more when resolving the cognitive dissonance by trying to understand where the other side is coming from and how the experiences that shaped our respective aesthetic intuition differ.


Or you could store your config as an environment variable in JSON, like Cloud Foundry does. I mean, JSON is really just one big string after all...


Now every program and script needs a full JSON parser and to all agree where this JSON is located in the system.


> Environment variables is exactly this: mutable global state.

Not really; environment variables can't be externally changed once they're passed in. If it were true mutable global state, some external process could change their value while a program is running. Or the program could change the value of the environment variable in the shell that launched it. Either of those would be horrifying.

In other words, environment variables could be worse. They could also be better. My main objection to environment variables is that they're not a discoverable interface. Ideally, it should be possible to find out what environment variables a program accepts without running the program (or grepping the source code for "getenv"), and in most cases it should be possible to enumerate the valid values.

Environment variables are just one of many things about the typical POSIX / ANSI C interface that have been around so long people don't even question them anymore.


> For comparison using JSON configuration files this entire class os problems would not exist. Every application would read the data in the same way

The issue they mentioned earlier remains, as it's not specific to env vars. What if you don't supply a config file?

> The environment is now different. What should the program do? Use the old configuration that had the env var set or the new one where it is not set? Error out? Try to silently merge the different options into one? Something else?

I'm really not a big fan of using env vars for configuration but there are cases where it's a good choice, for example running docker compose and passing a .env file that becomes the configuration for the container.

I agree that it's not a very good way to configure an application in a "shared" environment. For instance, I'm not a fan of curl reading my HTTPS_PROXY env var and implicitly using that value. That should be controlled exclusively by a cli switch imho.


Moving from tokenising files to env vars was one of the best choices I ever made. Life is sooooo much easier using env vars.


> environment variables are mutable global state, therefore bad

Wrong. They are immutable global state. There is no way of updating a process environment set after it's started. Apart from corner cases (execve), env vars are immutable, therefore act like constants. And constants aren't bad, are they?


The most common problem that I've encountered with using environment variables for configuration is that developers often add them ad-hoc, in the module where the configuration is needed, making it impossible to have a clear idea of all the configurable options in a program.

The solution is not to get rid of env vars, but to have a mechanism that centralises the configuration in one place; this way all the options are easy to locate, and they can be defined in multiple ways.

Shameless plug, FWIW: figga is my humble contribution to this area for Python: https://github.com/berislavlopac/figga


The author says:

    // to call it you'd do
    first_argument = 1;
    second_argument = 2;
    int three = add_numbers();

This is, I trust you all agree, terrible. This approach is plain wrong.

I agree, but I don't think everyone agrees. Just have a look at the Apple APIs: their configuration options are all objects you have to spend time configuring, then pass as an argument. I don't know why everything with Apple development beats around the bush. It's almost too object oriented.

To be clear, Apple is more like this:

    let argument = Argument()
    argument.first = 1
    argument.second = 2
    let addRequest = AddRequest()
    adder.add_numbers(with: arguments)
    let mathRequest = MathRequest()
    mathRequest.perform([addRequest])
    if let results = addRequest.results {
        // Do something with your results.
    }

And yes, Apple development makes me feel smart when I solve something that should've been documented.


I think you're objecting to different things than the author.

The author is complaining that there's no link or information that first_argument or second_argument are used by add_numbers.

In comparison, in the Apple API example you've given, addRequest is explicitly provided to perform.

What you appear to be objecting to is setting up potentially complicated config objects to pass rather than passing simpler arguments.


I never understood why an environment variable makes sense. A type-safe configuration in something like Dhall makes much more sense. For secrets, environment variables are also a terrible idea. An encrypted vault is much better.


You can have both: write your code in a way that you pass its configuration. This makes code more testable and reusable.

Now in the outermost layer where you start your program you get the environment variables and pass them.


I don't mind environment variables for config, but it really bugs me when libraries read them. I'd much prefer to front-load them into my app's entrypoint where I can see and control them.


After years of struggling with config files in heterogeneous production environments, I'll argue the opposite: environment variables are the BEST option to manage your configuration.


I don't have a strong opinion on this, but a comment in the article led me to this discussion, which is just wild: https://github.com/ninja-build/ninja/issues/1482 I get a sense that the maintainers hold a similar sentiment as the author of this piece, but it's never expressed, yet the frustration feels real.


Interesting: I went into reading this thinking in the context of programs. However, reading the replies here has made me realize it wasn't constrained to just that. For programs, there are better alternatives, especially if you're using a deployment pipeline that has things like key-value stores. In general, environment variables are great, especially for things that have access, practically, to not much else, like shells.


Relatedly https://boakye.yiadom.org/bikeshed/env/

In short, I'm in full agreement with the author on not implicitly using magic variables/values in our programs. Environment variables are fine so long as they're not read from within the depths of a function.


OpenMP[0] wants to have a chat with you about the usage of environment variables for runtime configuration of applications and dynamic adaptation of software to the environment it runs in.

[0]: https://www.openmp.org/spec-html/5.0/openmpch6.html


If my program needs more than a couple arguments, I prefer a YAML configuration file. I usually try to enable environment variable substitution in that file. This makes configuration more natural when/if the program is run in a container (you don't need to volume-mount a file); and it gives the user flexibility in how they configure the program.

To each their own, I suppose.


This makes sense. You can use a config file and a CLI param to point to a specific config file. However, env vars have worked very well for me, so there's this difference between theory and practice. I haven't found the reason, but ultimately the model that the AWS CLI uses is very ergonomic (default config < env var < CLI pointer to specific config).
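
A sketch of that layering in Python, with hypothetical names (the AWS CLI's real variable names differ); each layer only overrides keys it actually sets:

    import os

    DEFAULTS = {'region': 'us-east-1', 'output': 'json'}

    def load_config(cli_overrides):
        config = dict(DEFAULTS)                                  # lowest precedence: defaults
        for key in config:                                       # next: environment variables
            env_value = os.environ.get(f'MYTOOL_{key.upper()}')
            if env_value is not None:
                config[key] = env_value
        config.update(cli_overrides)                             # highest precedence: CLI flags
        return config

    # e.g. MYTOOL_REGION=eu-west-1 mytool --output text
    print(load_config({'output': 'text'}))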


I think there are some valid arguments to be made against the use of envvars for configuration, but none of them were in this article.


I have only very rarely used environment variables in my own software over the last 20 years. The argument for their use that I have heard the most frequently has been "you would have to be an idiot to not use them" which has thus far failed to convince me. That, and its little brother "you must never have pushed anything to production" seem to find some echo in this thread.

From what I could gather, the actual reasons that Other People use environment variables seem to be:

- Other People use some tool for a reason unrelated to environment variables, and that tool happens to make environment variables extremely easy or convenient. A more pessimistic alternative: that tool makes everything inconvenient except for environment variables. Whatever that tool may be, I suspect that I am not using it.

- Other People write programs that are invoked (I write programs that run continuously), and environment variables improve the ease with which programs can be invoked, which is a concept I never have to deal with.

My programs load their configuration from files at startup; configuration files are committed to git and included in the build; secrets live in a remote key vault, with a local cache in case of unavailability. Maybe in the future, I will discover something that environment variables greatly improve about this. I don't know.


His example:

  $ SOME_ENVVAR=... some_command <args>
Is wrong. SOME_ENVVAR is set only for the invocation some_command; at the next prompt, the value of SOME_ENVVAR is what it was before the invocation. To persist the setting of the value you need to export it:

  $ export SOME_ENVVAR=... some_command <args>


> Environment variables is exactly this: mutable global state.

Wait, a large point of configuration is exactly that they are read-only.


If you're a heathen like me, you sometimes store JSON configuration in environment variables https://mitchum.blog/how-to-store-json-in-an-environment-var...
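
The whole trick, sketched in Python with a made-up variable name:

    import json
    import os

    os.environ['APP_CONFIG'] = '{"port": 8080, "features": ["a", "b"]}'

    config = json.loads(os.environ['APP_CONFIG'])
    print(config['features'])   # structured data, arrays and all, out of one env var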


Storing the name of an "environment" in an environment variable and then using that to pick the appropriate configuration file is what .Net Core does for web apps and that seems to work fairly well.

Then, of course, you have the fun of Azure Application Configuration and KeyVaults - which are fine once set up and your app is happy with them.


there's an astonishingly silly GitHub issue thread linked in the comments of the post: https://github.com/ninja-build/ninja/issues/1482


That was a soap opera. I particularly liked the subplot where people started flaming each other about different versions of C and the use of goto.


That is... quite the issue thread. So many people to be annoyed at in there!


I think if machines are treated like single-purpose cattle then environment variables are fine.

But for a single system with lots of different applications and functionality going on, you kind of run into the same problem as you have with mutable global variables.


    int first_argument;
    int second_argument;

    int add_numbers(void) { return first_argument + second_argument; }

    // to call it you'd do
    first_argument = 1;
    second_argument = 2;
    int three = add_numbers();

> This is, I trust you all agree, terrible.

This isn’t really a case against environment variables per say but against global config variables generally. Whether it comes from the environment or a config file that state will be global. So the example of adding globally defined integers is obviously silly, but if you change those variables to a username and password strings then change the function to db_connect it makes sense.


Is there any good guide on how to use environment variables as config properly, especially on managing and deploying them to servers?

I am having a hard time figuring these out. So I rarely use environment variables.


Keep it in one place and have an easy way to disable setting configuration through ENV variables altogether. Don't let it get out of hand by sprinkling it everywhere, that's all.


> An environment variable can only contain a single null-terminated stream of bytes. This is very limiting.

The same is true of command-lines and files. Computers can only use bytes.


A file's contents are not effectively truncated just because a NUL appears halfway through it, unlike the environment or arguments.


"This is the way we have always done it so it must be correct!" != "This is the way we have always done and it works perfect for everybody!"

I want to use env vars because:

* I want to reuse builds and my systems have different behavior on different environments.

* I want to override the default config sometimes

* I don't want to expose sensitive configuration, and yet have it explicitly configured in my project configuration

The post is very opinionated and lacking any valuable point, IMO.


Great post! A few months ago I wrote this about my experience with this (mostly based on my experience writing Go): https://henvic.dev/posts/env/

IMHO almost always (with exceptions) pointing to a secret/value in a file is better. For all other things, flags passed on arguments are better.


The article is not against env vars per se as the title suggests. Instead it makes a case against undocumented features and improper use of env vars. It also complains about env vars being plain strings but fails to recognize config files are just plain strings without the appropriate parser. What stops anyone from putting JSON arrays in env vars?


sure env vars are overused. some values simply do not belong in the environment, for example - credentials. however, i haven't seen a more convenient method of injecting configuration dynamically without some error-prone file watching mechanics


Why don't credentials belong in the environment? They're definitely not visible to other users there, as opposed to on the command line where they definitely are, or in a file where they can be if you don't set the permissions correctly.


Because with credentials in the environment, we now have to distinguish between private and public environment vars so we don't accidentally log something we shouldn't, or otherwise expose security-sensitive info.
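
And that distinction leaks into every bit of tooling; a sketch of what it tends to look like (the list of "private" names is made up):

    import os

    PRIVATE = {"DB_PASSWORD", "API_TOKEN", "AWS_SECRET_ACCESS_KEY"}

    def env_for_logging():
        # Every place that dumps the environment now has to know the secret list.
        return {k: ("<redacted>" if k in PRIVATE else v) for k, v in os.environ.items()}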


I think you're trying to say: don't use environment variables as a database.


Somebody never heard of CGI?


Another nice one is chdir() and how it behaves between threads.


When you realize threading was added late in the game on Unix, this is hardly surprising. Unix is process-oriented.


But I keep putting more and more env vars into my programs. Enough command line switches already, but for debugging I always fall back to undocumented getenv's, rather than adding and removing silly -Dhexmasks.
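
E.g. the kind of throwaway switch I mean (a sketch; the variable name is invented):

    import os

    # Undocumented escape hatch: MYTOOL_DEBUG_MASK=0x4 turns on extra tracing
    # without growing the official command-line surface.
    DEBUG_MASK = int(os.environ.get("MYTOOL_DEBUG_MASK", "0"), 0)

    if DEBUG_MASK & 0x4:
        print("tracing enabled")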


Calling env vars 'global mutable variables' is pretty misleading. The example C code is a terrible strawman, and makes it appear that the author is simply unaware of some core compsci/programming concepts.

Environment variables behave much more like dynamic variables (AKA dynamically scoped/bound variables), rather than global variables https://en.wikipedia.org/wiki/Scope_(computer_science)#Dynam...

In particular, we can override env vars for a particular call, without affecting anything else. For example, consider the following script:

    echo "BEFORE $FOO"
    FOO=bar printFoo
    printFoo
    echo "AFTER $FOO"
If we run this script with 'FOO=foo', and assuming that the program 'printFoo' simply prints the 'FOO' env var, we will get:

    BEFORE foo
    bar
    foo
    AFTER foo
The variable 'FOO' wasn't mutated (since the value remains the same across lines 0, 2 and 3); it's also not globally scoped (since line 1 saw a different value). Rather, each process is run with its own environment, whose initial contents are inherited from the scope that invokes the process, plus extras/overrides like 'FOO=bar' above.

Env vars are mutable, and mutating an env var acts differently to mutating a dynamic variable: dynamic scope looks up variables on the call stack, so mutations high up the stack will be visible after returning. Instead, env vars are copied from parent to child at each new scope (process), so mutations are only visible to that process and any subsequent subprocesses. In fact, that makes env vars even less 'globalish' and 'mutableish' than ordinary dynamic variables!

For example: if 'printFoo' finished by mutating 'FOO' to equal 'baz', it wouldn't affect the above script at all. Even line 2, which "inherits" the script's FOO, would only be mutating its own copy of 'FOO', which doesn't affect the script's variable.
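
The same copy-on-spawn behaviour, sketched outside the shell (Python here, purely for illustration):

    import os
    import subprocess
    import sys

    os.environ["FOO"] = "foo"
    # The child gets a copy of the environment; its mutation stays local to it.
    subprocess.run([sys.executable, "-c",
                    "import os; os.environ['FOO'] = 'baz'; print('child sees', os.environ['FOO'])"])
    print("parent still sees", os.environ["FOO"])  # -> foo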

In any case, I highly recommend avoiding mutating env vars, for the same reason I avoid mutating any variables, regardless of language, unless there's a specific reason to. If we treat env vars in an immutable way, then they act exactly like dynamic variables.

As an example of dynamic variables, consider the following Lisp code:

    ;; assumes FOO was declared special (e.g. with DEFVAR), so LET rebinds it dynamically
    (write-line (concatenate 'string "BEFORE " FOO))
    (let ((FOO "bar"))
      (printFoo))
    (printFoo)
    (write-line (concatenate 'string "AFTER " FOO))
The '(let ((FOO "bar")) ...)' construct acts like the 'FOO=bar ...' of the script.

Not only do I find env vars very useful for config, I also find dynamic scope is very under-utilised in "proper" (non-shell) languages. For example, dynamic scope is a great way to do dependency injection: rather than passing around extra arguments, or adding a bunch of private fields to objects, etc. we can just reference the dependency with a dynamic variable, and open a new scope whenever we want to set its value (at an application's entry point, or in a test, etc.)
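
For instance, in a language without real dynamic variables you can approximate the same thing; a sketch with Python's contextvars (names invented):

    from contextvars import ContextVar, copy_context

    # The "dynamic variable": a default for normal runs, overridable per scope.
    db = ContextVar("db", default="postgresql://prod/app")

    def handler():
        return "querying " + db.get()

    # A test opens a new "scope" with the dependency swapped out.
    ctx = copy_context()
    def with_fake_db():
        db.set("sqlite://:memory:")
        return handler()

    print(ctx.run(with_fake_db))  # uses the fake
    print(handler())              # back to the default outside that scope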


appdirs


This blog post is so wrong and out-of-touch that it boggles the mind how it popped up on HN. The blogger's strawman for criticizing the use of env variables is so mindless and outlandish that it casts doubt on the author's experience and know-how.

Meanwhile I'll just say the following:

* Any config input, whether passed through env variables, config files, command line arguments, external services, or telepathy, is handled by clients. Sure, anyone is free to call getenv() directly from whatever corner of the project, but to handle any input properly you need dedicated clients that cover common use cases such as input validation, logging, sane defaults, and layering config settings.


I lol-ed.


Damn, this guy has never worked in a production environment before.


I have, and I think he's right.

I don't think picking apart the (assumed) experience of the author in this way is conducive to a high quality discussion.


Meh. Most configuration libraries I've used make all of these issues into non-issues. It's very useful to be able to set configuration with environment variables, even if you're loading from a file.



