Don’t use environment variables for configuration (nibblestew.blogspot.com)
140 points by ingve on April 1, 2021 | 286 comments



I disagree with this post so strongly - having spent most of my career installing, configuring, and managing other people’s software.

> The answer is that you, the end user, can not now. Every program is free to do its own thing and most do. If you have ever spent ages wondering why the exact same commands work when run from one terminal but not the other, this is probably why.

If the same program is behaving differently between two systems - it can -only- be the environment that’s different.

> Instead of coming up with a syntax that is as good as possible for the given problem, instead the goal is to produce syntax that is easy to use

Oh to be a developer. The “best possible syntax” is a universe of possibilities - environment variables are, thankfully, limited to strings. If all programs had to be configured with a Turing complete config language - that would just be a programming language! Limitations can set you free.

Sorry for the harsh tone. Please, and I believe I speak for most sysadmins, please continue to use environment variables.


This is a voice of reason, not harshness.

One of the projects I currently work on has a configuration system/model/table that is a monster. I wish it was just strings. It basically contains boolean flags, strings, numbers etc., but the most insidious part is that it can contain groups of related configs - meaning people started dumping stuff into it that should be in a normal table (e.g. ShippingType: Road, Rail, Air), so we get no foreign key constraints for reference data. This caused them to implement soft-deletes for it, so now you have some config values that float around forever because they were referenced somewhere (a pseudo foreign key). It's so utterly dumb I want to delete the whole thing, but everyone (non-tech people) thinks it works great.

I'm a developer, not a system admin, but this is too much. It doesn't help that 5 different people have added 'features' to it over the years. The same thing can be accomplished with something waaaay simpler/cleaner. We also keep having config-related issues where people blame the system/servers/devops etc. Every single time it is due to misconfiguring the project, so the system admins cannot do anything about it anyway (meaning we keep wasting their time thinking the problem was with the servers). Let me pause here, need to take a blood pressure pill.


> The “best possible syntax” is a universe of possibilities - environment variables are, thankfully, limited to strings.

The same can be said for command line arguments. I usually prefer those instead of environment variables, because they must be explicitly specified instead of being implicitly passed by the parent process. I think of env vars as being more useful when repetitively calling commands interactively, to save some typing, or when you really do want processes to inherit config from parents (like PATH and its variants).

But overall I still agree with your sentiment.


100% this. The developer behind Prometheus was a huge dick to people about env vars a while back, in similar fashion. Just the other day, they held another closed-doors vote after a year or so and finally decided they were OK.

Doesn't surprise me the creator of Meson of all people made the same dogmatic assertion.

What a circus this industry has become.


Any link about that? Are you talking about this? https://github.com/prometheus/prometheus/issues/6047#issueco...


It's fascinating how you can do this job for decades and learn about new tools daily. And I mean tools that have been around for ages.

I just learned about envsubst https://www.gnu.org/software/gettext/manual/html_node/envsub...


This was the original discussion that spanned quite a long time. https://github.com/prometheus/prometheus/issues/2357


I came here for this comment. I'm a developer and I have been using env vars for quite some time after I got burned numerous times by other options. I just can't see why I would use anything else.


I have no skin in this game, so to speak, so just curious.

In what ways did you get burned by options other than env vars, where env vars would not have burned you?


Mostly configuration stored in the database (who configures the database) or 3rd party configuration services without an SLA and config files that are either present or not. Env vars are really simple since they are completely decoupled from the app and you can have default values for all of them. You just need a single class that loads all config on startup and you can go from there (or fail if you don't have the mandatory vars).
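
A minimal sketch of that startup-loading idea in Python (the variable names and defaults here are made up), using a frozen dataclass so the config can't be mutated afterwards:

    import os
    from dataclasses import dataclass
    
    @dataclass(frozen=True)
    class Config:
        database_url: str                 # mandatory
        smtp_host: str                    # optional, has a default
        debug: bool = False
    
    def load_config() -> Config:
        if "DATABASE_URL" not in os.environ:
            raise SystemExit("DATABASE_URL is not set")  # fail fast at startup
        return Config(
            database_url=os.environ["DATABASE_URL"],
            smtp_host=os.environ.get("SMTP_HOST", "localhost"),
            debug=os.environ.get("DEBUG", "0") == "1",
        )
Everything downstream takes the resulting Config object; nothing else reads os.environ again.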


>who configures the database

or, what configures the database connection


I forgot the question mark: (who configures the database?). So the problem is that if you have your configs in a database, then you carry the additional burden of separately setting up databases for each environment... it is like trying to put the hash of an image on the image itself.


They also work well with serverless environments, containers, etc., which can’t be said for some alternatives.


Environment variables are also more portable and cheaper/simpler than a DBMS, an LDAP service, an application server's proprietary configuration repository, etc., without being weaker at specifying simple configuration values.

In practical terms, specifying environment variables in cleanly isolated and composable layers (defaults by user, a specific terminal session, a script that calls another script or the actual program) is a major advantage over more enterprisey and monolithic mechanisms.


Bingo.

The author's title is unfortunate: "Never use environment variables for configuration"

Not everyone is writing CLI scripts. Some are writing multi-environment software for the web. Some people care about git and distributed teams.

Let's say you back up and migrate a db which has config in the db. Well, your other environment is now using the wrong config! Now you're possibly using prod SMTP credentials and sending notifications to the wrong people, because all you wanted to do was have live content and do some testing, or show debug messages because you're on a test environment.

Web frameworks which have the HTTP host in the database drive me nuts. Why do you need that? Your webserver is responding on a domain name. Why do some store the absolute URL in the db? It makes no sense. For the 3 people that want to serve up domainA.com but have all links really be domainB.com, maybe that makes sense.

Don't get burned people. For most things, just ignore this article, keep your secrets out of version control and keep your code ready to deploy to multiple environments. Decouple that. Your app should be able to work with various configurable services (SMTP, push notifications, database, etc) and you don't want that config in version control.

So either a file that's outside of version control with variables you set per environment or actual server environment variables.

Environment variables in many web languages are just namespaced globals. Still better than global variables. They serve a purpose.

The author has this:

    int first_argument;
    int second_argument;
    int add_numbers(void) { return first_argument + second_argument; }

While it helps the author's agenda, that's not a legitimate example.

A real example would be a service provider which a developer would understand to have the sole purpose of pulling from environment or config values to initialize.

You wanted a Twilio client? Well, we know it needs some keys. Use environment variables. Boom. Everywhere you ask for this Twilio client you get the same instance with the config that that environment needs.

This is not a problem. It works well. Better than any alternative.
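
As a rough sketch of that provider pattern in Python (ApiClient and the variable names here are hypothetical stand-ins, not Twilio's actual SDK):

    import os
    from functools import lru_cache
    
    class ApiClient:                      # stand-in for a real SDK client
        def __init__(self, account_sid: str, auth_token: str):
            self.account_sid = account_sid
            self.auth_token = auth_token
    
    @lru_cache(maxsize=None)
    def get_client() -> ApiClient:
        # Read the environment once; every caller gets the same instance,
        # configured for whatever environment the process was started in.
        return ApiClient(
            account_sid=os.environ["ACCOUNT_SID"],
            auth_token=os.environ["AUTH_TOKEN"],
        )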

Author also says this "Environment variables is exactly this: mutable global state."

For some CLI applications maybe. But for web languages this is not accurate. Some languages, yes, you can mutate the env variables at runtime. But your code shouldn't rely on any mutable env vars. You can load those env vars into a config. Even cache it. Not allow mutations. Many modern frameworks support this or you could implement it yourself.

And if you have a CLI that really needs to be explicit about the environment vars to run? And not be different the next time you call it with no env vars? Well maybe take those as arguments?

Plenty of CLI apps store config in JSON and have a wizard to set those credentials - the AWS CLI, for example.

Now you got JSON as the author wanted. That can work across platforms. But guess what? You work on multiple clients and hosts and next time you call it? Well you might not be expecting that it had config for another client or app.

Same problem. The state was mutated. You called it again. You potentially got burned. Now you have to do the CLI wizard to reconfigure. Or you pass in explicit params if allowed.

Definitely a consideration if you are writing such a program. Do you make it explicit with arguments and options? Load from ENV vars? Have some CLI wizard and save to JSON?

The author's suggestion of JSON is no reason to toss out ENV vars. Just solves slight differences between Windows and Unix. Which is why you can program a CLI wizard if you care about that problem.

We don't need to say no to env vars just because we want CLI app users across different platforms to have the exact same API. Having to set up a JSON file just to use a program is really lame, which is why you see CLI wizards when you run them.

Or things like "aws configure". And you still have CLI arguments and options available.


I left the question mark out. It is "(who configures the database?)".


> If the same program is behaving differently between two systems - it can -only- be the environment that’s different.

Yes, but the entire problem is that "the environment" is massive, as it includes all hardware and software running on the device in question (and quite possibly other devices, as the network can easily be considered part of "the environment"), which makes it difficult to track down differing behaviour.

"The environment" is not just environment variables. I've run into a spreadsheet bug where I got wrong results because of a CPU bug. Just because some global mutable state exists, that doesn't mean it's a good software design to have program behaviour depend on it.


>If the same program is behaving differently between two systems - it can -only- be the environment that’s different.

As a developer, I've generally used configuration files for changing the operation of my software and environment variables for information my code needs to know about WHERE the code is running. For example, the same code doing the exact same thing on 10 machines would have the same configuration file (or command line parameters in simpler cases) but the environment variables may change from machine to machine.


I think the post has some merit if we differentiate strongly between what is truly external to the code.

The article's point stands if we're being lame and treating internal code details as external.


> Oh to be a developer. The “best possible syntax” is a universe of possibilities - environment variables are, thankfully, limited to strings. If all programs had to be configured with a Turing complete config language - that would just be a programming language! Limitations can set you free.

I was thinking about this while reading the post. What's the best language or syntax for configuration, if environment variables are too simplistic and full-fledged programming languages are too powerful? That's why I'm interested in Dhall, which is a configuration language that's limited to compile-time only–i.e. it can't do anything at runtime.


There was quite a vogue for using programming languages to configure things for a while. It still has its place I think.


I feel one good argument against env vars is that loggers might log them, but I've never been bitten by that myself.


Chromium with --v=1 will indeed spit out the API keys it's been configured with, and the debug output by default gets logged into a file.


This attitude is how we ended up with Active Directory. One assumes that the author doesn't spend much time in a command-line environment. The option to set defaults for one's normal working environment is obviously of value, and environment variables are hardly exclusive of configuration files (~/.bashrc, makefiles, ...). The theory that complex syntax and statelessness are inherently good and cost-free is naïve to the point of parody.

The author is correct that "this is the way we have always done it" isn't a good argument in and of itself to persist in a practice. However, they might be rewarded by a few minutes pondering a related idea: "if generations of people---many quite capable of modifying the system to use something else---persist in using something, it's possible they have a reason for doing so other than a deficit in competence or imagination."


On the surface, environment variables may seem like they're analogous to global variables, but they're not.

Environment variables are scoped to the current process. This could be your shell, but it could also be a web server. This doesn't make them leak proof, but unlike global variables, environment variables have a scope.

Environment variables are also used more widely as an API. Many CLIs have a command that, when self-executed, can act as a persistence layer in your shell. This functionality would be impossible without environment variables.

You mentioned the other motherlode, which is that environment variables are quite often, again, used to communicate in Makefiles. You can see the natural scoping if you start to kick off an ad-hoc shell process inside a Make target.


I don't get this. From the same perspective you can argue global variables have scope too since they are "scoped to the current process". It's not like other processes can access your global variable, that's a very low bar for a scope.


The difference here is that a running shell (including the shell environments that services run within) is an abstraction layer for managing processes. Within that abstraction layer, each process can have its own unique environment variables that it can change and manipulate independently. That makes them not global.

Within a process the top level abstraction is the process itself, and anything underneath (class, method, function) will be impacted if another sub-abstraction makes a change to a global variable.


Makes sense, thanks.


> It's not like other processes can access your global variable, that's a very low bar for a scope.

They can read them, in /proc/[PID]/environ
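
For example, a quick Python sketch (this only works if you have permission to read the target process, typically the same user or root):

    import sys
    
    pid = sys.argv[1]
    with open(f"/proc/{pid}/environ", "rb") as f:
        # entries are NUL-separated KEY=value strings
        for entry in f.read().split(b"\0"):
            if entry:
                print(entry.decode(errors="replace"))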


Interestingly, reading this blog post, this doesn't seem to be common knowledge. The first comparison of global variables inside a process and environment variables left me wondering. It just felt wrong.

A gripe I have with environment variables is when they are used to modify a program's behavior deep inside its belly and aren't treated like configuration input similar to program arguments.

Otherwise they are a universal way of configuring applications. Universal is good, universal is nice.


> Environment variables are scoped to the current process.

Obviously this is true in a sense. But for practical purposes, it depends on how the environment variables are set. For example, if I set them in my ~/.bash_profile, they're scoped to all bash processes that my current user runs. If I put them in /etc/profile, they're effectively global.


Depending on the variable, sure, but that's a wide latitude in the interpretation of "global". They're not linked or the same memory in any form or fashion. They're merely identical in value.


True. I was thinking in the context of software deployments where environment variables are commonly used as read-only configuration values once the deployment is made.


@ajarmst is that last related idea you quoted your own? That’s a powerful expression of how I see human culture: question everything, but remember to respect the ideas of the people who came before you. There might be a baby in the bath water you’re discarding.


It's a restatement of the lesson of Chesterton's Fence.

There is a fence somewhere you wish to get rid of. Nobody around knows why it was there in the first place. No one will let you tear it down until somebody figures out why it was put there in the first place.

The important takeaway is that before changing something, one should understand the history of the thing, or one runs the risk of falling victim to the same problem the unknown thing was put in place to address.

Life is way more complicated than the umwelt of any one individual, so it is not safe to just change something without doing the footwork to understand what led it to be there in the first place.

Demonstrate that work has been done, and generally, no one will get in your way.


The sentiment certainly isn’t original. The phrasing sounds like me, though. I can get pretty pedantic.



What do you mean about Active Directory?


Strongly disagree. Environment variables are, IMHO, the best tool for some simple configuration in unix. They match perfectly with the behavior of the ecosystem and other tools in it (like the unix shell).

Yes, if your OS is some universal JS machine, then JSON would be better; if it is a Lisp machine, then you would use S-expressions; but on a Unix machine, environment/args are the way to go.

There are two realistic alternatives - config files and arguments. They each have their own niche, and the environment sits somewhere between them.

Arguments are better for one-shot setting, not for some setting used always. You can use 'alias' to define shortcuts that always add some argument, but that is definitely more cumbersome.

Config files are good for always/default settings, but are too rigid. Changing a config file is the equivalent of changing a global variable in code; it has a system/user-wide effect. Whereas I can just change the environment in this one shell and it will affect just the commands executed from that shell. Also, config files are much harder to manipulate from scripts, and use a different syntax for each tool.

Perhaps the ideal tool would allow every option to be set/changed from config file, environment and argument.


Why I don't like environment variables:

1. I worry about programs dumping all their environment variables to log files - credentials are now on disk, ingested into log storage...

2. Environment variables are inherited by child processes by default. This is undoubtedly useful. But it can also cause problems.

I wish the ghosts of unix past had foreseen the need for a way to mark particular variables as, say, 'sensitive' and 'noexport', allowing them to opt out of the default behaviour.

It would have been so easy to say "variables starting with _ are not inherited and should be censored when output", but we're about 40 years too late for that to catch on...


I used to agree with (2), but now I think: meh, it's an implementation detail whether the program uses my environment variable 'directly' or via a child process; it's not meaningful to make that distinction.

Where it is meaningful (and this is supported today) is setting them just for specific programs/invocations, rather than exporting them for a long-running interactive shell (and everything within it) willy-nilly.

More innovation around making that easier would be interesting: the env vars that should be set specified per program, isolated from others, for example. So `foobar` would actually get executed like `FOO_SECRET=hunter2 foobar` without specifying it every time or having it exported in the shell, and in a generic way that isn't specific to each program's config.
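
A toy sketch of what such a wrapper could look like in Python (the mapping file, its location, and all the names are invented for illustration):

    import json
    import os
    import subprocess
    import sys
    
    # hypothetical per-program mapping, e.g. {"foobar": {"FOO_SECRET": "hunter2"}}
    with open(os.path.expanduser("~/.config/envwrap.json")) as f:
        mapping = json.load(f)
    
    prog = sys.argv[1]
    env = dict(os.environ)              # start from the normal environment...
    env.update(mapping.get(prog, {}))   # ...and add the vars scoped to this program
    sys.exit(subprocess.call([prog, *sys.argv[2:]], env=env))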

It's not really related but for some reason 'summon' is on my mind as a tool to mention. I haven't used it in anger yet, but it is interesting. It's not quite this though, or at least, it solves only the 'storage' part of the implementation of what I described, not the 'orchestration' or mapping of programs to vars/summon invocations.


This is pretty much how systemd works. You can specify secrets that are retrieved from somewhere else and provided to the process in the environment it is started with. So you could do exactly this with the right unit configurations.


Ha, funnily enough I mentioned systemd and then deleted it. I do run as much like that as possible, I just couldn't succinctly explain why I thought it was different or better than putting:

    VAR=whatever process
in .xinitrc or wherever.


Iiuc you are advocating for setting env vars at the call site, like `FOO_SECRET=hunter2 foobar`? In that case, why not just use command-line args and call it like `foobar --secret=hunter2`?


I wasn't advocating for it in preference to args, but there are circumstances where that's not possible, for example calling some CI/CD tool (say terraform, ansible, fabric, whatever) that doesn't consume the var itself but uses something that does.

It's also a more convenient/already generic interface for doing something consistent across multiple programs.


Because in the latter case the commandline of the executed process (which may be exposed in various places, including a simple process list) is `foobar --secret=hunter2` and in the former case it's just `foobar`.


My first thought on the headline are specialized concerns of the above: environment variables are an attack surface. If you use them for configuration, it's all too easy for an attacker to modify them without the victim knowing. Just look at issues with LD_PRELOAD: https://attack.mitre.org/techniques/T1574/006/

That said, I agree with GP that environment variables are super useful and super simple. But I've also been burned more than a couple of times by setting something in the past and then having it cause unexpected bugs that are hard to trace down as they aren't in my working memory. They're a double-edged sword, to be sure.


There's actually a long list of variables that are unset when invoking sudo to prevent these kinds of attacks. Systemd will also start programs with a very minimal environment that isn't inherited from any shells. You then have to specify environment variables explicitly as part of the unit file. You can also specify environment variables in environment files.


1. If your program is chatty, it can be chatty in the same way regardless of where the improperly logged secrets come from; it's still your fault for being coarse and lazy. There's little difference between logging all environment variables (and/or all command line parameters) and logging the whole configuration object.

2. If your child processes shouldn't inherit environment variables, set them properly. The "ghosts of Unix past" have "foreseen the need" for execve(2) and execveat(2), which don't pass anything by "default".
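
Higher-level languages expose the same thing; for instance, in Python the environment handed to a child can be specified explicitly (a minimal sketch):

    import subprocess
    
    # the child sees only the variables passed here, nothing inherited
    subprocess.run(["/usr/bin/env"], env={"ONLY_THIS": "is visible"})
    
    # os.execve(path, argv, envp) works the same way: the last argument
    # is the child's entire environment block.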


> I wish the ghosts of unix past had foreseen the need for a way to mark particular variables as, say, 'sensitive' and 'noexport', allowing them to opt out of the default behaviour.

The default behavior is a non-exported variable. If you want child processes to see it, you must export it.


There is no such thing as an exported or non-exported environment variable. In fact, as far as the kernel is concerned, there is no such thing as an environment _variable_ at all, just a block of data.

See execve(2):

"envp is an array of pointers to strings, conventionally of the form key=value, which are passed as the environment of the new program. The envp array must be terminated by a NULL pointer."

You can confirm this by examining the environment block that was passed in to your current shell with:

    < /proc/$$/environ tr '\0' '\n'
What you're referring to are actually "shell parameters", some of which may be marked for export. When the shell starts up, it parses the environment block and sets parameters based on what it finds, marking them all for export. And the shell uses only the parameters marked for export when constructing the environment block for a child process (which is passed to execve(2)/execveat(2) in the envp argument).


I guess I assumed you were referring to environment variables in the context of the shell. Apparently I was incorrect.


Isn't storing credentials in environmental variables bad practice to begin with?


What better place is there to store credentials?


A config store like Vault. Of course, that needs credentials too, which are typically a file on the file system.

IMO, people are overly sensitive about environment vars. They are really no worse than files on the file system - both can be accessed if you're a privileged user on that machine.


Vault should be the source of those env variables, via some predefined init container or something like that, to which devs don't have access.


Or you could, you know, auth to vault and pull the creds from vault inside of your app?


You could, but then you’ll have replaced a universal and standardized abstraction with a hard commitment to one very specific approach. That doesn’t come cheap.


One thing that I like, which this approach allows for, is live configuration. For things like databases and such which allow for the regular rolling of credentials.

It's not simple by itself, but it simplifies other things.


How do you auth to vault?


Via either program config files, an inline subshell calling cat, or ssh-agent in that specific case, to keep credentials both out of the environment and off of the command line, where they can be read by inspecting the resulting process for its invocation.


All of those places can also be read.

SSH agent is a good example. It’s effectively an environment var which is why this works fine:

  sudo SSH_AUTH_SOCK=$SSH_AUTH_SOCK git clone ...
Edit:

The reason I think it’s silly to make a blanket statement environment vars are bad is because too many containers have credentials baked into the image when they should be passed in another way.


You can disable the ability to read the memory of other processes, and you can do the same for environment variables. Storing access tokens in memory is more obscure than environment variables though, that is true.


Store a path to the top secret file in an environment variable, and have the program read the credentials out of the file. Put the file somewhere far away from the repo, on the deployed filesystem.
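
Something like this, as a rough sketch (SECRET_FILE and the path are invented names):

    import os
    
    # only the *location* of the secret lives in the environment;
    # the credential itself stays in a file outside the repo
    secret_path = os.environ["SECRET_FILE"]      # e.g. /run/secrets/db_password
    with open(secret_path) as f:
        db_password = f.read().strip()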


Yes, for those reasons


Compared to what?


depends on who you are talking to, since you run the risk of committing the creds to a git repo


How do command line arguments or config files solve either problem?


Command line arguments aren't inherited by child processes. Unfortunately they are visible to other users on the system, so they're no good for credentials.

Config files (or an abstraction of them such as reading config data from a socket), after parsing, result in some credentials sitting in the memory of the process that needs them. They aren't in the environment block, so a quick and dirty "dump all my environment variables to stdout" procedure won't risk exposing them. And for the same reason, a child process that does the same won't inherit them in order to expose them.

Note that I'm talking about preventing accidental exposure of credentials. Config files alone can't protect credentials from a malicious process that deliberately goes looking for them to leak them; for that, additional measures have to be taken... but environment variables aren't part of the solution!


It's also easier to encrypt a config file and provide the decrypt key externally.


So then the real secret is the key, which is provided "externally" ... how? Through command line parameters, some other config file, or environment variables? :P


Another encrypted config file. Obviously.


Encrypted files all the way down.


A file that’s protected appropriately or standard input.


> Unfortunately they are visible to other users on the system, so they're no good for credentials.

If you’re concerned about your own software logging credentials, command line arguments are negative in two regards:

They’re highly visible when the process is running; they’re often automatically logged.

> They aren't in the environment block, so a quick and dirty "dump all my environment variables to stdout" procedure won't risk exposing them.

Okay — but the usual way that happens is “dump my config object in a log”, which parsed configs don’t help with.

You also now have a config file: how is it stored? ...is it in the repo? ...what are the permissions? ...how do we deploy it?

Environment variables don’t persist in repos and are designed to be integrated with hosting tools, like secrets managers.

I’m not seeing how a config file beats Kubernetes injecting from the secret store, which is why we use environment variables: so our tools (secret stores) can configure the environment our software uses.


Good point about command line arguments being often automatically logged! So they're bad for both reasons :)

Now, if you're running in k8s then you can improve your setup by mounting your secret into your container, and have your code read the credentials from the file within the mount. This just looks like another kind of config file to me :)


Embedding secrets (which should be changeable and have limited access) into container images (which should be reproducible and are perhaps stored in accessible locations) sounds like not a good idea; IMHO you definitely need the capability to have the same container use different credentials so that, for example, you can run the same container in a development or testing environment as in production, but with different credentials.



I believe they meant mounting it via kubectl as a file, from the secret manager.

So runtime file injection.


> And for the same reason, a child process that does the same won't inherit them in order to expose them.

Wait, this is the crux of why you think it’s more secure — but actually I see the reverse problem:

Dropping environmental variables is standard security practice, but dropping file access permissions is not. Most child processes read from the same set of files as the parents.

How would having files rather than ENVs make my container more secure, where we’re concerned with developers making mistakes (passing ENV vs passing file permissions)?

Similarly, the only proposed benefit of your idea is we don’t have them around post reading — but that’s true if you initialize a config object from ENV and then pass it around as well. (You ignored my point about how mistakes via logging happen.)


I'm concerned with 'I have credentials in my environment block and just dumped the whole thing to a log file'. Avoiding storing the credentials in the environment block obviously avoids that.

To be fair, unsetting sensitive environment variables after consuming them would probably also avoid that eventuality. I can count on the fingers of no hands the number of times I've seen developers do that! :)
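
For the record, it is close to a one-liner, at least in Python (the variable name is made up):

    import os
    
    # read the credential once, then drop it from the process environment so a
    # later "dump everything" log line (or a child process) can no longer see it
    api_token = os.environ.pop("API_TOKEN")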

Some other part of my process (or a child process I might launch) deliberately hunting for credentials in order to leak them is a different problem with other solutions.

In between these two cases we have mistakes like "dump config object (containing credentials) to a log file". That, too, can happen and should be avoided, what more can I say?


> You also now have a config file: how is it stored?

Hum... Your environment variables must be stored somewhere too, so the server can be launched. You can store the files in the exact same place.


Sure — you throw them in the Kube secret manager.

But now you have multiple config files (smart) or your entire config outside the repo (not smart). This isn’t always the wrong approach — SSH keys get loaded this way, for instance.

ENV variables naturally provide a way to layer content from different sources in a way that files don’t, so if you have a relatively simple config from multiple providers (eg, getting AWS session token from the host plus your environment config from the launch ENVs) it’s easier to use the K-V store nature of ENV variables versus multiple files.

Again, multiple config files isn’t always wrong — but using that to store single strings instead of ENV variables is a code smell, for sure.


> Perhaps the ideal tool would allow every option to be set/changed from config file, environment and argument.

This is exactly what the most widely used golang configuration library does: https://github.com/spf13/viper


It's also what most of the enterprisey frameworks do. Spring will do this and I'm pretty sure ASP.NET has some form of it.


I made something similar for Python, with an animal theme too, ha:

https://pypi.org/project/tconf/


I was taught many moons ago that configuration, like ogres and onions, is best considered in layers:

1. default values: What will most users in most places find most useful/least infuriating?

2. configuration files (system-wide, then user): What will most users on this system want most of the time? What will this particular user want most of the time?

3. environment variables: How should this session (i.e., a potentially large series of related executions) be tailored?

4. command line options: What is most useful for this particular run?

I was also taught that:

- figuring out how to go from an option to the name of a corresponding environment variable to a line in a config file should be both straightforward and well documented; and

- sometimes you need a more complex configuration than is cleanly supportable through any other method than a file. In such a case, the location of that file can itself be passed through options and the environment.
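
To make that precedence concrete, a rough sketch in Python (the file paths and variable names are illustrative only):

    import argparse
    import json
    import os
    
    # 1. default values
    config = {"colour": "auto", "editor": "vi"}
    
    # 2. configuration files: system-wide first, then per-user
    for path in ("/etc/mytool.json", os.path.expanduser("~/.mytool.json")):
        if os.path.exists(path):
            with open(path) as f:
                config.update(json.load(f))
    
    # 3. environment variables tailor this session
    if "MYTOOL_EDITOR" in os.environ:
        config["editor"] = os.environ["MYTOOL_EDITOR"]
    
    # 4. command line options win for this particular run
    parser = argparse.ArgumentParser()
    parser.add_argument("--editor")
    args = parser.parse_args()
    if args.editor is not None:
        config["editor"] = args.editor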


This is precisely how I set up my utilities. I have found the following practices useful:

1. Print the path of the system and user level configurations that the utility honours in the help text (-h/-?)

2. For an option that can be set interactively or via environment variable, specify the environment variable name in the help text itself to provide maximum choice to the user.

3. Provide a -viewconfig option that prints out the final resolved configuration state so that the user can see the actual configuration that is in effect. Combined with a -dryrun option, this can provide a lot of confidence to the user to try out things without breaking anything.


My guess for this is that some people have not had the good fortune of seeing software that followed this pattern and how nice it is.

I thought it was common knowledge that if you really wanted to do configuration right on a given project, you do all 4 (with some library support) and you write your code to gracefully handle the right piece of configuration from the appropriate "override level" (again, usually with the support of a good library).

See also: Domain Driven Design[0] which (if you ignore the consultant-fodder and jargon that comes with it) is probably one of the best written guides of how you should abstract systems, just like the gang of four book is a good introduction to structures in program/algorithm implementation you're likely to see in real life.

[0]: https://en.wikipedia.org/wiki/Domain-driven_design


Yeah, the article is just confused:

> Envvars have some legitimate usages (such as enabling debug logging) but they should never, ever be used for configuring core functionality of programs.

As though logging weren't core functionality!

The actual thing that is bad is grabbing an environment variable in the middle of your program. You should grab all the configuration in one place and use it to configure local state that is transparently passed around. Furthermore, flags, env vars, and config files are all just maps from strings to configuration, so you should use some system that can transparently layer them on top of one another. All of my new CLIs use flags first and fall back to ENV vars if the flag wasn't set.
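
That fallback is cheap to wire up; for example, in Python with argparse (the flag and variable names are invented):

    import argparse
    import os
    
    parser = argparse.ArgumentParser()
    # if --listen-port isn't given, fall back to $LISTEN_PORT, then to 8080
    parser.add_argument("--listen-port", type=int,
                        default=int(os.environ.get("LISTEN_PORT", "8080")))
    args = parser.parse_args()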


I like layered config as well. It really should be the default way of thinking about config. Our custom application framework handles that exact sort of layering and it's wonderful. I give it an annotated class representing the config that I need and it handles populating the fields from the config and generating the help message if something is missing.


This echoes nicely the traditional wisdom, also described here: http://www.catb.org/~esr/writings/taoup/html/ch10s02.html


(for default values, make sure you consider safety too! for example a debug option that might show PII is likely to be most useful to most people using the program, but shouldn't default to on because if it were on in prod the consequences would be serious)


yeah, i don't get too worked up about "how" config values enter the application as long as i can easily see "where" they are initialized/validated.

an immutable config object/class created on startup that reads files/env vars/whatever and has appropriate assertions to ensure good values were used and crashes the app for missing/bad values usually keeps things sane.

an app where each subcomponent has its own config that it gets in its own way usually leads to confusion and delay


I have a pet peeve about this attitude. These methods have been used for decades and are well understood, with all their advantages and disadvantages.

One day, someone comes along and tells us that it's bad and considered harmful, and happily tells us the only right way to do it. A flame war ensues.

I'm all for moving things forward and evolution, but can't we take a milder stance and move forward in a more peaceful way? Attacking something so well established because of personal reasons feels so wrong from my PoV.

That thing wouldn't be a de-facto standard if it was too bad, right? I think we shouldn't play with the foundation that much.


> That thing wouldn't be a de-facto standard if it was too bad, right?

I disagree. Very often, the 'easiest' option wins, not the one that necessarily is the 'best', especially in the long term. Environment variables for configurations are fast and easy, and work well on the happy path - but break down quite fast when dealing with more complex cases (e.g. more complex types than strings). They have a treacherous way of seeming the simplest and most pragmatic solution at first, but becoming an untyped, underdocumented hairball after some time in more complex software.

Furthermore, as long as environments have existed in UNIX and UNIX-like derivatives, their usage to configure the bulk of the behaviour of most services/programs is relatively new. The more old-school the service you deploy somewhere, the more likely it has a file-driven config. Indeed, sometimes it seems like 90% of the Docker code out there is converting environment variables into configuration files.


> Indeed, sometimes it seems like 90% of the Docker code out there is converting environment variables into configuration files.

This is a consequence of Docker's choice of the "image" as an abstraction layer. It's not trivial to say "run this image but with this config file added" (yes you could bind mount one in, or create a new derived image, but those are both harder and come with more pitfalls).

In most common docker usage, there are exactly two ways to influence the operation of the program contained within the image: Environment variables, and command line arguments.


For automation you will store them in a file anyway, but then how is it different from a bind mount?


More usually, in k8s configmap.


> I disagree. Very often, the 'easiest' option wins, not the one that necessarily is the 'best', especially in the long term.

Thank you for your disagreement and discussion, honestly. Actually, I think using environment variables is a burden. It needs more documentation, more explicit warnings, a lot of handling, etc.

So, environment variables are not the easiest way out there, especially when almost any programming language has nice config file libraries out of the box. Instead, these variables are added as a convenience feature for some frequent scenarios where a tool needs to adapt itself to the environment it runs in just before starting, or needs to be run repeatedly with small, transient changes to the config.

> Furthermore, as long as environments have existed in UNIX and UNIX-like derivatives, their usage to configure the bulk of the behaviour of a service/program is relatively new.

This is not what I see in my career. Bulk of the applications we installed and ran used some forms of environment variables for runtime configuration of the tool/application.

The reason for that is that the variable had a great deal of effect on the behavior of the program (which was generally scientific), making multiple runs without modifying a file very effective. You need these runs to conduct research, BTW, and you're on a cluster and jobs run long and whatnot.

TBH, most of these applications also had configuration files or "sensible defaults", and they created their default files if there was none. And if there was a file, the environment variable acted as an override.

So I had experimental software, fixed most of the parameters in the file and tried some other things by overriding some parameter(s) with an environment variable. Nothing was abused or misused.

> Indeed, sometimes it seems like 90% of the Docker code out there is converting environment variables into configuration files.

I've never seen it TBH, and if that's not documented well, it would be a big bag of fun for the users of that code.


> This is not what I see in my career. Bulk of the applications we installed and ran used some forms of environment variables for runtime configuration of the tool/application.

I think we might have different backgrounds and considerations as to what counts as 'oldschool'? Maybe I shouldn't have extrapolated this to pre-2000... So, my experience comes from working with the following 'mood' of services:

  Postfix, Exim, qmail, slapd, PostgreSQL, MySQL, FreeRADIUS, Apache, Nginx, ...
All of which have their own config file/files, format, etc. All of these system-wide services, not user applications. And I think that's the main difference? I tend to deal with software that is deployed in isolated environments, be it by root users on production server, or by whoever in a containerized environment. And not deployed on an interactive systems, to then be started/reconfigured by users running on the same system.

> I've never seen it TBH, and if that's not documented well, it would be a big bag of fun for the users of that code.

Check out the list above on Dockerhub. I'm not sure all of them are dockerized in this way, but at least a handful of them are.


I've further clarified my PoV here [0], but it won't hurt to reiterate. I'd be happy in fact.

> I think we might have different backgrounds and considerations as to what counts as 'oldschool'?

Most probably. All of the software you mentioned (maybe except Exim4) is actively used in our environments; quite a few of them are in very vanilla configs, and some of them are customized to the point of abuse. However, it's worth mentioning that all of the software you mentioned is in a support role in our scenario; it's the so-called side dish which we configure once and leave alone for a very long time.

> Maybe I shouldn't have extrapolated this to pre-2000...

I've started with a C64, please. :)

> All of these system-wide services, not user applications. And I think that's the main difference?

Yes, the tools I've talked about are userspace programs, and are not daemons 99.999% of the time. So you need to run them many times with small differences, and reconfiguring/regenerating a file is a lot of work, but as I said, they all have configs and env variables are convenience overrides most of the time.

> Check out the list above on Dockerhub.

Will take a look, thanks. Wanted to learn docker in depth for a long time, but had no notable project to force me to use it. Maybe someday.

[0]: https://news.ycombinator.com/item?id=26660409


> I've started with a C64, please. :)

Personal experience in computing is not what I meant. I only realized that I'm not intimately familiar with the dawn of the UNIX daemon and how their configuration methods changed with time, only the echoes of this in daily Linux use. Thus, I realized I was possibly extrapolating and assuming things.

> Yes, the tools I've talked about are userspace programs, and are not daemons 99.999% of the time. So you need to run them many times with small differences, and reconfiguring/regenerating a file is a lot of work.

Yeah, and I think this lack of distinction is what poisons the discussion surrounding this post - these are separate worlds with different requirements, conflated into a single argument or point of view.

tl;dr I stand by my point with preferring anything over environment variables for services (especially complex ones), but I also fully agree with your usecase for interactive, CLI-driven systems. I mean, one of my favourite programming language features in recent years is that I can cross-compile Go programs just by setting two env vars: GOARCH and GOOS :).


I'd be absolutely horrified if a service that I use needed a specific environment variable set in a particular way to work correctly and it wasn't well documented. That service would get bonus points for the inability to configure that particular option in a configuration file.

I personally would never add environment variables to a program I write which may run as a service.

Oh, Java is calling, hold on... :)


Sure, but everyone understands that option and knows the pitfalls. This is kind of the core of the “worse is better” mindset, which somehow seems to have disappeared from the collective consciousness even though Unix is bigger than ever.


> Attacking something so well established because of personal reasons feels so wrong from my PoV.

That is a mischaracterization of the post. The author is making a clear technical point about how environmental variables are global mutable state. Labeling that as an "attack because of personal reasons" is just plain misleading.

> That thing wouldn't be a de-facto standard if it was too bad, right?

How much of the post did you read? Your point is almost exactly the same as the 3rd listed in the post:

> It's the same old trifecta of why things are bad and broken:

> 1. Envvars are easy to add

> 2. There are existing processes that only work via envvars

> 3. "This is the way we have always done it so it must be correct!"


> That is a mischaracterization of the post.

I don't think so. First of all, as I detailed in [0] and [1], my central point of disagreement is the tone and attitude of the post, not the usage of environment variables itself.

There are a lot of scenarios where environment variables make a lot of sense, and scenarios where using them is absolute madness, as we discussed with q3k in [1].

> How much of the post did you read?

All of it. BTW, please remember asking this question is directly against guidelines [2] (sec: In comments, guideline 8).

> Your point is almost exactly the same as the 3rd listed in the post: 3. "This is the way we have always done it so it must be correct!"

As I said in my other comments, I do not directly support the exact opposite of the author's stance. My disagreement is in the tone and rigidity of viewpoint. To quote myself:

I'm not calling this good with the persistence of the original author. I'm saying it's one of the realities that we have, and instead of burning it with torches, why not build better conventions around it with a better attitude and language?

Please see [0] and [1] for further clarification.

[0]: https://news.ycombinator.com/item?id=26660409

[1]: https://news.ycombinator.com/item?id=26660553

[2]: https://news.ycombinator.com/newsguidelines.html


> These methods have been used for decades and are well understood, with all their advantages and disadvantages.

Just because something has been used for decades does not make it good - for example, avoidable mutable state. And it certainly does not make it well understood - as a consultant, the number of brain-frying environment variable configurations I've had to deal with which no permies could tell me anything about defies belief.

> someone comes along and tells us that it's bad and considered harmful,

Yes, some things are bad and are actively harmful. Famously, unstructured programming using gotos. Would you like to go back to that? Believe me, you would not. But perhaps you are not a programmer?

> That thing wouldn't be a de-facto standard if it was too bad, right?

It's not a "de-facto standard", it's simply bad.


> Just because something has been used for decades does not make it good.

I'm not calling this good with the persistence of the original author. I'm saying it's one of the realities that we have, and instead of burning it with torches, why not build better conventions around it with a better attitude and language?

Maybe we can try: "Instead of burying all config under environment variables, why not try doing it like this?", and slowly build something better, step by step. Nothing is inherently good or bad, but anything can be abused. So the abuse of environment variables as a shortcut needs to stop, one may say, and I'd agree, and might also volunteer to help build a better thing.

But, shunning it with anger and shouting "I'm the one who knows all right things!" sure creates backlash, like here.

All in all, I'm against the attitude, not the idea of improving a situation.

> Yes, some things are bad and are actively harmful.

It might be, but even your solution might not be right. Why the attitude?

> unstructured programming using gotos. Would you like to go back to that?

Did that on some older, limited hardware, and it was fun. It was not OK by today's standards, but I had to. I'll do it again if it's the only thing I can do to work on that particular hardware again.

> But perhaps you are not a programmer?

I just design algorithms and develop scientific applications which run on HPC clusters, nothing fancy.

> It's not a "de-facto standard", it's simply bad.

I didn't say it's good. I say it's a fact. I'm not disagreeing on its bad sides. I'm not OK with the attitude.


> "Instead of burying all config under environment variables, why not try doing it like this?", and slowly build something better, step by step

Use named files that can be version controlled and documented, but are directly readable by the actual program and can be reported as an error if (for example) they are not found.


Actually, this is how I do it:

    1. Make the thing completely configurable with a file.
    2. Add an optional switch to select the location of the config file.
    3. Always ship a well commented, sensible config file with the application. Document environment variables there, if any.
    4. If it makes sense, add the ability to generate a default config file if it's missing.
    5. Always add a logger, with good debug and error output (Dovecot is my inspiration and role model there).
    6. Document everything in the code, and in the documentation if possible (external documentation is not my strong part yet. I can't write it fast enough).


> But, shunning it with anger and shouting "I'm the one who knows all right things!" sure creates backlash, like here.

I think this is relative. I did read the post as well, and I didn't find the author to have such a "negative" attitude (but then again, I'm not American).


Agree. Also I don't see any proposal put forth. Did I miss it? The gist I got was "env vars are bad and if we don't use something else (what?) we're hung up in the past".


Easy fix (if it's the shared, mutable state which bugs you):

* Create one class responsible for ingesting env vars at startup.

* Call it from main, and abort early with nice messages if it fails to read something.

* Now you have a nice (preferably immutable) class which guarantees the config is in a 'good state', and is self-documenting because it lists all the keys it uses to lookup env vars with.


This is essentially what I do in Rails apps. The only reference to an env var is in an initializer that sets an option in the global rails config structure.


The mutable state can be helpful. It is sometimes helpful to be able to change an app’s config without having to restart it. Ingesting the envs on startup into a class removes this ability.


Please, never do that on a server.

It's ok for interactive applications, but if you are writing a CLI command (which is different from an interactive CLI application), a system library or a daemon, don't ever let the same application that uses a configuration also change it.

When your non-interactive programs do that and anything at all goes wrong, it's basically impossible to determine the source of the problem. Also, it is common that bugs that one could just avoid triggering by configuration now become unavoidable.

(But if you mean reload the config after getting a SIGHUP or something like that, yeah, this is ok, and the best way to do that is by restarting everything on your program, even if you keep the same process, so your read-once class won't be a problem.)


That's quite an antipattern in production.

Immutable config via a config class that can exit early (preferably at startup) if there is a misconfiguration.


The same pattern works well in Python at the module level, if your application is set up as a package. A module config.py sets a bunch of Python variables like

    import os
    
    ENV_VAR = os.environ.get('ENV_VAR', 'default-value')
then the rest of the application can grab configuration with

    from .config import ENV_VAR
Since the assignment code executes on import, all config is read in when any piece of it is first used, consistency checks and logging can be written into the config.py module as normal python statements, config values can be cast to appropriate types (raising exceptions if they fail), etc.


In node.js there are quite a few packages that do exactly this - it's a great pattern and forces you to define which env values you will rely on in one place, rather than dripping them all over your codebase.


This is more or less the way I handle configuration in most applications that I create. +1


Agreed - this solution works well, and works nicely with statically typed languages.


Oh this guy again. This guy created Meson and has a pattern of being 1) Quite toxic and 2) Entirely dogmatic when it comes to software design. He shows no interest in discussing design problems with Meson and asserts his viewpoints as truth and fact, resorting to snide comebacks instead of having thoughtful conversation.

Doesn't surprise me he wrote an article like this. Completely misguided and isn't rooted in reality.


Okay, so when you write that this blog author, who made a post arguing how environmental variables are global mutable state, is quite toxic and resorting to snide comebacks instead of having thoughtful conversation, then that is just you engaging in thoughtful conversation about the issue (which is envvars), and not you making a toxic ad-hominem attack at all, right?


GP is just one voice in this discussion, where others have already addressed the substance of the article. Some context and history is valuable.


I would say at least the comments about dogma and tone are relevant in the current context.


Yep.


It surprises me that people put more attention to the author than to the content. I have no idea who the author is, but his post doesn't seem to me toxic at all nor dogmatic.

On the other hand, your comment sounds a bit toxic, to be honest: "because the author is X it must be that all of his articles are X as well".


This post seems to ramble without much substance.

The best argument, perhaps only valid argument, is lack of an array type. Easy to work around. The rest seems misguided or ridiculous.

>There is no way to know which one of these is the correct form

What? Of course there is.

The author also calls it mutable global state, and seems to reference an application becoming confused when ENV isn't set. This reads to me, perhaps incorrectly, as though the author doesn't understand that env vars aren't, and don't behave like, shared global variables. That is, changing one won't affect running applications.


To be fair, environment variables are mutable global state across a single shell session.


How so? A running process won't affect your current env, and changing your current env doesn't affect the running process.


I don't understand how a blog post like this can garner, at the time of writing, a hundred upvotes. Normally these types of strong statements "Don't do X/Don't use Y" are attention-seeking titles whose content is inversely proportional in interestingness. That's the first red flag. The second red flag is that the article is not proof-read: from just the first few minutes of skimming, I found two errors that make the text jarring to read (Persistance->Persistence & [you] can not now -> [you] can not know).

Thirdly, the entire central point makes no sense. The author presents this argument to illustrate why environment variables are confusing:

> The environment is now different. What should the program do? Use the old configuration that had the env var set or the new one where it is not set? Error out? Try to silently merge the different options into one? Something else?

The answer is obvious to anyone that knows what environment variables are and how they work: if the variable is set, it should use it, and if it's not, it shouldn't.

The author goes on (again, spelling error) to state that:

> For comparison using JSON configuration files this entire class os (sic) problems would not exist.

What is the practical difference between using a JSON formatted text file containing settings, and a text file containing environment variables and their definitions? For the situation we're discussing here, the answer is frustratingly simple: there is none. This post is a waste of time.


Yes. At the same time, the post missed an appropriate case for avoiding environment variables: when you want dynamic configuration for your services, where configuration values can be changed at runtime without requiring a full redeploy of all the instances.

Of course, that brings its own set of complexity which should be carefully weighed against requirements.

Environment variables are still one of the simplest, and yes most deterministic, ways to alter the behavior of a program.


Such a shame. Environment variables are indeed difficult to work with sometimes: who sets them? In which file? Who can/will override them? Typos also aren't caught, because editors don't have lists of possible variables and/or what values they are allowed to contain. You might also lose them on different containers.

Environment variables should be part of cgroups in some way. I don't like that any program can modify the PATH variable, as an example. Seems like a recipe for disaster in privilege escalation.


> who sets them? in which file? Who can/will override them?

The ops team. Environment variables are a great way of separating operational concerns from business logic. Environment variables are great because your application is agnostic about how the configuration is sourced. Let the ops/infra team handle that.

> I dont like that any program can modify the PATH variable as example.

And ... why not? Child processes can't modify the environment of parent processes. Environment variables flow downwards.
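
A quick way to convince yourself of that, sketched in Python (DEMO_VAR is just a made-up name):

    import os
    import subprocess

    os.environ['DEMO_VAR'] = 'from-parent'

    # The child inherits a copy of DEMO_VAR and mutates only its own copy.
    subprocess.run(['python3', '-c',
                    "import os; os.environ['DEMO_VAR'] = 'from-child'; print(os.environ['DEMO_VAR'])"])

    print(os.environ['DEMO_VAR'])  # still 'from-parent'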


Author doesn't know about https://12factor.net and as another commenter mentioned, probably hasn't deployed something to a 'production' environment (or rather, doesn't know about separation of such environments in the first place).


To be fair, "configuration" means a somewhat different thing when we're talking about a user application vs a server. It's easy to forget on HN that some software engineers don't write web servers at all.

Having used the profiling tool TAU (subtle dig), I instantly understood what the author is driving at, and I somewhat agree for many use cases. I shouldn't have to fill my dotfiles with 10 new variables just to use a utility.


I don't know about https://12factor.net either.

But I was assigned a task last year to remove configuration from environment variables (for security reasons). I deployed my work to 'production'.


Linux is usually configured to not allow processes from another user to read /proc/$pid/environ. At least a production machine should be.

Configuration files are resistant to this as you note, but command-line arguments are not (--password=1234 will show up in ps for everyone).


What security concerns did that alleviate?


Any third party code in our system can just read whatever's in the environment and POST it to some remote server.


Avoiding environment variables reduces the risk but doesn't eliminate it. The secrets still live in memory in some form, correct? However, it does help to eliminate generic attempts to exfiltrate environment variables.

Tight control of egress network traffic is better but more difficult to implement.


Any third party code can just read your credentials file and POST it to remote server.


Bold of you to assume my third party code runs with the same UID and SELinux label as my credentials-handling code.

(I wish, it's April 1 after all!)


If the third party code runs with a different UID, then it can't read the environment either.


Unless it has DAC override or other capabilities. Belt and braces!


If it has DAC override, then it can read your credentials file just as easily as it can the environment.


Not if SELinux policy prevents it.


File permissions allow finer granularity of access control. Environment variables are visible to any user in the system.


Not in any multi-user multi-process OS. You set environment variables in a process (i.e. shell/CMD.EXE) and spawn a child process (the program) from that parent. The environment variables will only be visible to those two processes.


Linux disagrees; try

    strings /proc/*/environ
to see for yourself.

On Solaris/SunOS, you could use `pargs -e $PID`. And so on.

Having separate UIDs to run your processes A and B under shields either one from peeking at the other's environment, though. UNIX DAC is simple and powerful enough for MOST security concerns, I would argue.


> Environment variables are visible to any user in the system.

This is completely false in any modern OS. You can only see environment variables of your own processes.


Unset them right after evaluation.


That's not where the credentials are stored.


Well, sure, you shouldn't be putting secrets or other sensitive data in environment variables. But garden-variety configuration is fine to put in env vars. Seems like whoever assigned you this task didn't really know what they were doing.


Oops, I've been putting secrets in environment variables since I can remember. Your comment piqued my curiosity about why this is a bad idea.

Found this:

https://diogomonica.com/2017/03/27/why-you-shouldnt-use-env-...

https://security.stackexchange.com/questions/197784/is-it-un...


It looks like the author is talking about command line tools that use env vars for things that should be arguments. In the comments on the page he admits that for example key credentials are valid usages for env vars.


Why the trolling? He has valid points. It doesn't matter whether he knows about this methodology or not. Does everything look like a nail to you?


> Author [...] probably hasn't deployed something to a 'production' environment

I think you're wrong about that. The blog post author is also author of Meson, the build system.


If you don't mind, I'll keep doing the relatively sane thing: using env variables at the startup of my applications (and, as much as possible, never anywhere else), among other configuration sources (like text files), to create a struct / object / dict / whatever that represents the configuration, and that the rest of my code uses.

If you see someone using `os.environ["xxx"]` as an escape hatch for a mutable global variable, then, yes, it's probably not a good idea. But it's not configuration anymore, it's runtime state.

(Although I suppose no one is going to hit HN front page by writing an article titled "Don't use global mutable state", except to make game programmers giggle ?)


"Environment variables is exactly this: mutable global state." No it isn't. Every time you start a process, it gets a set of environment variables of its own, which won't be changed by any further changes in the parent process. This is the opposite of how global variables work and is exactly how function arguments work.

The rest of the article isn't very good either. The examples of running a program with two different states and of trying to do nested escaping would both apply to any means of passing configuration.

I also find it hilarious that the author suggests using JSON configuration files instead, which actually have many of the problems that this article falsely claims that environment variables have.


> which won't be changed by any further changes in the parent process.

You can attach to a process and call setenv in gdb, so there is a loophole somewhere.

I'm not encouraging this, just pointing out that having a nice binary which reads config, cli, and environment once is still required. Regardless of how you pass that information into your binary.


The title should be "don't sprinkle environment variable reads all over your codebase", not "don't use environment variables".

The problems proposed are easily fixed by:

1. read all your env vars at startup in a single function, then pass them down from there (a minimal sketch follows after this list)

2. don't invent your own serialization format, just use json or csv, both work fine in env vars. or use the env var to reference a file path that contains more complex values.
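
A minimal sketch of point 1, assuming a Python service (the variable names are just illustrative):

    import os
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Config:
        db_url: str
        workers: int

    def load_config() -> Config:
        # The only place in the codebase that touches os.environ.
        return Config(
            db_url=os.environ['DB_URL'],                  # required: KeyError if missing
            workers=int(os.environ.get('WORKERS', '4')),  # optional, with a default
        )

    def main():
        config = load_config()
        # ...pass config down explicitly from here on...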

As a devops engineer / sysadmin for going on 10 years now, I pretty strongly disagree with this article. Environment variables are so much better than the alternative.

In the past, programs frequently invented their own configuration loading systems, but over the last few years containerization has strongly nudged most programs towards accepting env vars. The result has been a better, more consistent, and less surprising config experience for everyone, even non-container users.


Mixed feelings about this. I strongly agree with some, but also feel it's missing the point in a lot of places.

Environment vars should of course not be used to configure specific programs. Having an environment variable to specify the args with which to call a program is needless complication. Just pass it as an argument. Use a configuration file to configure the program.

Environment variables should (only) be used to describe the environment. That should primarily be variables that transcend individual programs. Things like proxy settings are perfect for environment variables. (Well, almost; see below.) Perhaps the location of the configuration file, if that can vary per system (which it probably should be able to; hard-coded locations can also be a problem).

But even then, environment variables can fail. I noticed that some Azure/Kubernetes-related commands on my work Macbook need to run with the proxy on, and others with the proxy off, so I created aliases to enable/disable this environment variable, which completely defeats the purpose of the environment variable. Maybe I should be able to configure this proxy per application after all. Or at least configure whether to use or ignore the proxy settings. And then there are applications that ignore the proxy env var for whatever reason, and require me to configure it specifically for that one program, again defeating the purpose of environment variables (I think npm does this).

But when we deploy our application to different environments, our deployment configuration does set specific env vars so the application knows how to behave in that environment. It's what environment variables are for. But they're not a great fit. For example, we're currently in the process of migrating from AWS to Azure, and some things need to be enabled or disabled there. So we set some environment variable to 'false', except that environment variables are always strings, and in JavaScript, 'false' evaluates to true. JSON configuration might actually make more sense here.
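
For what it's worth, Python has the same trap, and the usual fix is to parse the accepted spellings explicitly; a rough sketch:

    import os

    os.environ['USE_AZURE'] = 'false'

    # Naive: any non-empty string is truthy, including 'false'.
    print(bool(os.environ['USE_AZURE']))   # True

    def env_flag(name, default=False):
        value = os.environ.get(name)
        if value is None:
            return default
        return value.strip().lower() in ('1', 'true', 'yes', 'on')

    print(env_flag('USE_AZURE'))           # False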

So I don't think we can or should do without environment variables, but I think I agree they're overused, and often used badly.


Yeah, we have a service that needs to use a proxy for most calls, but one component mustn't use the proxy, in a way that iirc is not properly captured by a `NO_PROXY` entry. Now the abstraction breaks down and the service has some ugly special-case code. :(


The old guru I learned Linux from always said: "The bugs I've spent the longest time tracing down have always been caused by environment variables. It was only when I started checking them first, rather than last, that I felt competent as a sysadmin."

Truth is, you aren't going to get rid of them unless you make an operating system that doesn't have them (and good luck porting anything useful to it). This is poor advice, because it amounts to FUD -- those problems won't go away, but telling people to avoid environment variables will quash their curiosity about them. This teaches people to only look to the environment last.


I've endlessly debated with myself about this and have come to the conclusion there isn't really a good solution; the best choice is probably to put as much in an actual file as possible. The environment is sort of leaky and non-obvious. At least with a file there is something written somewhere that can be inspected and passed as an argument or as an environment variable. The only real positive I see with the environment is that children of the process group basically get it for free, but that can be a negative in and of itself as well.

If I have to pick between environment and arguments as configuration, I'd probably prefer arguments since the application would have to explicitly iterate over all the arguments and handle them in some manner, like assign them to some structure or global internal to the program.


This flies in the face of pretty much every opinion I've heard from experienced developers in the past 5 years. Once someone said, "You should be using ENV for configuration" I started doing it, and I found it to be a better solution than I previously had. I am also open to dedicated config files, whether they set ENV vars or not.

I'm open to the idea that ENV is not the only way, and I certainly believe that there are situations where other solutions are warranted, but my opinion right now is that this is wrong, and I perceive this also to be the prevailing opinion in our industry.


Yes. If you package software up for deployment on some server (i.e. you use something like Docker), environment variables are the easiest/only configuration mechanism at your disposal. Packages that don't support this, need some workarounds (e.g. dockerize to template some config file using environment variables) to be packaged up; which is annoying and extra work. Decent server software comes prepackaged in docker form these days. Which means environment variables are the way you control those unless you want to force your users to create their own docker containers just so they can fiddle with config files, which is a bit user hostile.

Dedicated configuration files only make sense if you assume a writable file system is there. Which is a broken assumption on many containerized environments. There is a lot of legacy software that works that way of course. Some software allows doing configuration via config files and then allows overriding keys in those files with some naming convention via environment variables. That's a good compromise since that allows you to package up sane defaults that you override as needed via the environment. It does not have to be an either or type thing.

Another common pattern is to allow overriding configuration via commandline arguments, which you can then gather in an environment variable and inject via docker. I do this a lot with JVM based software where we have a lot of -D options to override specific configuration value defaults via docker. Less clean than just having dedicated environment variables in the docker file but it works.


> Dedicated configuration files only make sense if you assume a writable file system is there.

I disagree. It's pretty typical in containerized environments (in my experience) to pass in (eg. mount) a config generated by whatever configuration management system into a container for it to load its configuration from.

This has the following advantages over env var configs:

- support for more complex expressions than string -> string maps (eg., configuring an IP blocklist)

- less chance of mistakes stemming from typos (eg., 'FOO_LODGIR' instead of a 'FOO_LOGDIR' in an environment variable will likely be silently ignored by a service, while a 'lodgir' key in a config file will cause an error in most serious config parsers that I've seen)

- working against a schema - if you use something like openapi, json-schema or protobuf/prototext to define your config format, you can use this schema to check/generate the config from other code, and even use it as an automatic source of documentation for the configuration format

- hot reloads of configs - once started with env vars, the env vars cannot be (easily) changed, while files can easily be changed (on mutable FS, or from an external source like ConfigMaps/Secrets in k8s), watched and reloaded from, or even used as a signal that the software should restart


Writable file systems are needed for things that are stateful but inappropriate otherwise.


Docker certainly supports mounting a configuration file on the host system. Might be a little messier but it’s not impossible.


IMO, the author is writing from the perspective of Meson, a command line build tool used by individuals that takes arguments and caches them in a per-project file, whereas most of the negative replies are commenting from the perspective of sysadmins deploying software into homogeneous servers or Docker containers. Would make be a better program if MAKEFLAGS was not an environment variable? (IDK.) Would Git be a better program if the project directory was passed as an argument rather than as inherited state (the cwd)? (IDK.) Would less or nano be a better program if file paths were passed in as environment variables rather than arguments? (no.)


> Would Git be a better program if the project directory was passed as an argument rather than as inherited state (the cwd)? (IDK.)

Well, it can do both :) `--git-dir=<path>` will let you run git for another project directory


> a command line build tool used by individuals that takes arguments and caches them in a per-project file

That is literally one of the worst ideas I’ve ever heard of. And I’ve heard many bad ideas recently.


What's the alternative? That's how make, cmake, msbuild, ninja, docker, and basically every other build tool I'm aware of works.


CMake does that and is a truly demented implementation, mixing user-specified initial state (supplied the first time you run CMake and carried on to future argument-less invocations), derived state, and cached compiler locations and versions in a single CMakeCache.txt file. Edit CMakeLists.txt? Time to delete CMakeCache.txt! Upgrade your compiler? Time to delete CMakeCache.txt!

I haven't used Meson all that much, but I recall it's a bit better than CMake but I still ran into a similar issue at one point.


You take the arguments every time? Or you read them from a configuration file.

But you do not automatically add them to the configuration file unless the user explicitly tells you to do so.


Docker doesn't do that. You can create a .env file, but it only reads the options you give it, Docker never writes/caches to that file.


Make doesn't work like that.


Wait, make has a global state? I was under the impression most make targets are stored somewhere in the project itself (like a build/ directory). I think this is a fine approach, BTW. I think global state should be avoided unless necessary.


Make does what you tell it to do. While some projects have the setup that you've described, many others do not. Make will only check the timestamps of the generated assets and decide to build (or not) based on that.


I could agree with "don't only use environment variables for configuration", but at the same time, by all means do use environment variables to allow for easy overriding of configuration.

I cannot but think of the large number of times that being able to quickly override some parameter with an env var has helped me achieve something which was not exactly intended by the original author.

Also, env vars are the most common way to override configuration of software that has been dockerized (prepared to run in a Docker container). Plus, containers remember their initial environment, so there is no issue of running afterwards without the env var.


Changing a variable of an existing container without recreating or re-running the docker run command is tricky, though. With a configuration file mapped to an external folder it's "just" modifying the file and docker restart; with environment variables it's somewhat more convoluted. I don't think it is as clear-cut as the title suggests, each kind of configuration has its place.


Yes, that was my point actually, although I might have expressed it poorly:

The article says that one problem of using env vars is that you might set it for one run, but then forget to set it for the next one. This could be problematic if the program is stateful. So I wanted to make the passing comment, while talking about containers, that a container will "remember" the initially set env vars, so if you stop and start the process, the variables would persist across executions.


Indeed, that's actually an awesome feature because it guarantees the container will work as it was running upon a restart.

The problem is when you have an environment variable you want to change for some reason without triggering a redeployment/release/whatever. With a configuration file in a volume you can change it and restart (of course this breaks the premise of "restart keeps full state", but I can live with file modifications; with docker at least you can keep them "triggered externally"). With an environment variable the only option I have found is stopping docker, digging into the container configuration, changing said variable and starting docker again (lovingly called "Indy swap" when we've had to do it, luckily it's less than once a year).


You don't need to modify the Dockerfile to change environment variables at runtime.

Also if you specifically want to modify a file and re-run the container look at --env-file arg to docker run.


That's not what I'm saying, what you mention can be done indeed.

What I say is that if you are in a machine with an already running docker container, one you don't have the corresponding `run` command of (i.e. you _can't docker run_ for whatever reason), changing the environment variables of that container is tricky.


You could always run the `env` command inside of the container to get all of the set environment variables. Then you can define, reconstruct and run your container with whatever args you want.

But yeah the env_file approach is the way to go here. I've been using Docker in production since 2015 and never ran into your use case. I always had an .env file ready to go that was loaded in with env_file and always had control over being able to run the container with whatever command I see fit.

And in cases where I have no control over how things are run (like Heroku), environment variables still work because a ton of hosting platforms expect you to set and read them for various configuration.

And you can also commit an example env file to git with no secrets so developers can `cp .env.example .env` to get going in 1 second.

The environment variable pattern is incredibly standard in the world of web dev.


I have been using Docker in production since 2015 as well, and I've had to do this more than once. To each their own.


> Environment variables is exactly this: mutable global state. Envvars have some legitimate usages (such as enabling debug logging) but they should never, ever be used for configuring core functionality of programs

Doesn’t matter where configuration lives, the name itself suggests that it is by definition global mutable state, I mean that’s the whole point of it.

And it doesn’t matter if it’s used to configure core functionality (e.g feature toggles) or secondary functionality. The whole purpose is to do exactly that :)


Maybe I live under a rock, but I feel like command line utilities taking environment variable arguments just isn’t that common of a practice? The only examples I can think of are when an argument is more of a “global variable” that many different commands may find useful such as JAVA_HOME, GITHUB_TOKEN, etc. For web applications and docker containers, environment variables fit easily into the deployment process and for all intents and purposes, are pretty much static.


>Maybe I live under a rock, but I feel like command line utilities taking environment variable arguments just isn’t that common of a practice?

Nope, it's quite common. Many standard UNIX cli utilities take some environment variables (including things like LOCALE). Also common in scripts, setting up programming languages paths, etc.

It's also not a bad practice, contrary to what the author says.


LS_COLORS and GREP_COLORS are weirdly-specific... but arguably aren't!


Examples I've used in the past few days, from memory - along the lines of:

LC_ALL=C foo

PAGER=cat man ls

TZ=UTC date


These are good examples but I think these fit under my “global argument” umbrella. Imagine if every Linux command had a different argument name for the pager?


One perspective from this thread: environment variables end up denoting different levels of persistence in interactive, server, and containerized applications. For interactive apps, environment variable configs tend to sprout up for _persistent_ configurations e.g. HOMEBREW_NO_AUTO_UPDATE - we want the behavior to change, but we don't want to pass a flag to make that change every time.

For server-like applications, env vars denote a _transient_ change in behavior e.g. FLASK_DEBUG=1 python -m flask ... to turn on different behavior in that instance of the application. Persistent configuration changes go to a config file or similar.

For containerized applications, env vars are back to denoting _persistent_ changes in configuration since we bake the values in to deployed containers via whatever orchestrator.

TFA seems to be assailing the first perspective, which is actually reasonable. Sticking secrets and configuration into the shell environment for each tool is not great. Transient config via args and persistent config via files makes lots of sense.

Even in that setting though, there is probably a role for environment variables when spooky-action-at-a-distance changes are required in sub (sub- (sub- ...)) processes / libraries where passing configuration through each caller would be a pain.


I hope I don't get downvoted, but I think the OP has a point. Command line arguments will do, and perhaps even be better.

Why are there so many harsh comments about this post?


In typical Linux setups, cmdline is world-readable, but environ is not. So you never should put secrets in cmdline, but they are ok to be in environ. And that is pretty much the only difference between the two.


The environment is inherited by child processes. I think that is a very important difference.


Of the relevant Linux syscalls, neither fork nor clone touches argv or environ at all; on the other hand, execve requires passing both argv and envp explicitly.


Unfortunately many people here have a bad attitude, thinking they know everything. His point is completely valid. I myself have always disliked this relatively recent use of environment variables for configuration – it likely has the 12-factor methodology to blame for its acceptance.


Command line parameters should be used for things that change frequently, not for configuration variables. You don't want to pass 20 command line parameters every time you invoke a program. You rather want to set your environment variables one time in your shell initialization script (.bashrc, .zshrc, whatever).

You can say there are the configuration files, and sure a complex program should have its configuration file. The problem of configuration files is that they all use a different syntax that you have to learn, you have to write them, back them up, copy around, etc.

Also, you don't have a simple way of overriding a configuration file parameter for one invocation of the program without modifying the configuration file. With environment variables it's simple: set them before executing the program.

Last thing is that environment variables aren't specific to a particular program: they are readable by every program executed in that shell. A configuration file cannot be easily shared, since one program can change its format to make it no longer backwards compatible, so you can't rely on reading other programs' configuration files, while you can with environment variables.


I agree. I also prefer command line arguments (or even config files) instead of environment variables, in all three cases: when developing software that others run, running software that others have developed, and running software I've developed.

In addition to what's said in the post, environment variables have one more significant, practical shortcoming: if an application 'foo' takes a FOO_LOGDIR (which defaults to /var/log/foo if unset) and you accidentally slip it a FOO_LODGIR=/run/log/foo - it will silently accept and ignore it, while you will be scratching your head wondering why the logging is not behaving as expected, until you discover the typo.

Naturally, 'foo' could ensure no FOO_* with unknown names are set. But I've yet to see an application like this in practice, while nearly every single flag parsing library out there immediately errors out on unknown flags.
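
The check is cheap to write, though; a sketch in Python, assuming the application's variables all share a FOO_ prefix:

    import os
    import sys

    KNOWN = {'FOO_LOGDIR', 'FOO_PORT', 'FOO_DEBUG'}  # whatever the app actually reads

    unknown = {name for name in os.environ if name.startswith('FOO_')} - KNOWN
    if unknown:
        sys.exit(f'unrecognised FOO_* environment variables: {sorted(unknown)}')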


Env vars are a built-in for key-value command line args. Why introduce some custom, convoluted way into your app for kv args instead of this standardised approach?

I can understand why some heavily used cli app would support a custom parameter pattern, eg. for ergonomics, but for the majority of apps running as deployed services and not cli apps invoked 100 times a day, I don't see a reason not to use env args.

I can run the following and introduce exactly zero parsing boilerplate into my app or invoking scripts.

    port=12345 color=green ./myapp.py
Also - some apps have to be configured from file, others on the command line. Which parameter parsing library even supports that? Env vars support this out of the box.
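
For what it's worth, the myapp.py side of that invocation needs almost nothing (names taken from the example above, defaults made up):

    import os

    port = int(os.environ.get('port', '8080'))
    color = os.environ.get('color', 'blue')
    print(f'listening on {port}, painting things {color}')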


Environ isn't actually key-value; it's only convention that applications (/libraries) parse it in the name=value form. But the underlying mechanism is similar to cmdline: an array of pointers to strings. You could parse cmdline the same way if you wanted.


It's an exceedingly strong convention. I'd be genuinely curious to learn of non-malware uses for entries within the environment block that don't fit the standard name=value pattern.


Don't POSIX and portable C both require key=value? If so, isn't that more than just convention?


I don't know about POSIX, but C doesn't mandate an implementation. The C Standard says about getenv():

Quote:

The getenv function searches an environment list, provided by the host environment, for a string that matches the string pointed to by name. The set of environment names and the method for altering the environment list are implementation-defined.

The implementation shall behave as if no library function calls the getenv function.

Returns

The getenv function returns a pointer to a string associated with the matched list member. The string pointed to shall not be modified by the program, but may be overwritten by a subsequent call to the getenv function. If the specified name cannot be found, a null pointer is returned.

End quote. The takeaway for me is that yes, environments do define a key/value store of some sort, but how they're implemented isn't stated. It can be an array of strings in "key=value" format stored in RAM, but it could just as well be a hash table stored in ROM.


Also, the specification of getenv does not exclude having other data in the environment that is just not retrievable by getenv but would be accessible e.g. through the third argument to main that is mentioned in the common extensions section.


Righto, I was considering Linux only, I don't personally care much for POSIX. I don't think ISO C standard sets any strong requirements. But yes, if you want maximum portability then you probably should be pretty conservative with environment.


There are a lot of use cases. For example, I have 17 applications running on a system that need a connection string. When I need to change that connection string, I can 1) change 17 config files; 2) change 17 shortcuts containing command-line arguments; 3) change one env variable.


> I can 1) change 17 config files; 2) change 17 shortcuts containing command-line arguments; 3) change one env variable.

This seems like forcing the application to take the burden of your configuration system having shortcomings (like not being able to freely convert from a single source of truth into multiple generated files/command lines/...).


Or change ONE file that contains the connection string.


That's literally what they're for. Take for instance:

DATABASE_URL is a fail-safe method of configuring. When you install the app in dev, the app can _only_ point to the dev database. If the variable is absent, the application fails to start. If the prod URL is accidentally put in the dev environment, firewall rules would prevent the prod connection.

Configuration is _best_ put into the environment, not the application itself.


> For example suppose you run a command line program that has some sort of a persistent state.

`$ SOME_ENVVAR=... some_command <args>`

> Then some time after that you run it again:

`$ some_command <args>`

This is absolutely no different from the following:

> For example suppose you run a command line program that has some sort of a persistent state. `$ some_command --some-arg=value <args>`

> Then some time after that you run it again: `$ some_command <args>`


If not env var then what is a good alternative? Config files could be considered "mutable global state".

I have no issue with env vars for read-only configuration. Programs that set env vars... that's where it gets very ugly.


Pydantic has a nice way to parse/validate a collection of environment variables into native Python types using type hints [0]. It solves some of the pain points the author mentioned in a clean way.

[0] https://pydantic-docs.helpmanual.io/usage/settings/
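
A small sketch of what that looks like, assuming pydantic v1's BaseSettings (field names and defaults are made up):

    from pydantic import BaseSettings

    class Settings(BaseSettings):
        port: int = 8080
        debug: bool = False
        database_url: str   # required, no default

    # Reads PORT, DEBUG and DATABASE_URL from the environment (matching is
    # case-insensitive) and raises a ValidationError on missing or badly
    # typed values.
    settings = Settings()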


I've always used a mix of environment variables and a set of config files for my apps. Often the vars determine which config files are loaded, which mode the app operates in (expert,normal,beginner), and debug level. I never put security information in the env variables. I also always treat vars as unvalidated user input (i.e., apply sanitization and validation).


I think these disadvantages can be mitigated by putting the code that fetches environment variables close to the entry point and pushing the values down as arguments. I’d also say that a lot of these things are problems with configuration in general and one should seek to minimize unnecessary configuration to avoid a lot of the associated headaches.


This is such a bad take that it feels like it being in the frontpage might even be dangerous.


No, they aren't mutable global state, they are parameters, which is something many languages support: the key difference by design is that their configuration only lasts for the dynamic extent of the call.

An environment variable is not mutable global state, for if a process changes one, then that change is inherited by its children, but not by its parent or siblings, and the change effectively stops existing once the process ends. That's a very important design choice that removes all of the problems with mutable global state.

In that sense, they are more like thread local variables that are inherited by child threads.

A configuration file is much more akin to global mutable state of a system.


Clearly we shouldn't use environment variables for everything - I agree with the do-it-in-layers comments of others - but I was compelled to point out he's got it exactly backwards. He suggests that environment variables are mutable global state, when they're not, then suggests actually mutable global configuration files. Surely in his scheme about adding numbers this is equivalent to setting the numbers to add in an entirely different source file!

Environment variables, on the other hand, are more like lexical scoping - you can shadow them, spawn a new shell with copies of them, override them for a single invocation and then have them go back, etc.


I've recently started developing unix tools with config passed entirely as arguments with a special argument called `--args-file` which takes a file or stdin and reads one arg per line.

This is of course nothing new, but such a powerful pattern:

- It lets you run things without the need for a config file (when testing/running interactively)

- but on the other hand, you can still define arguments saved in a file for more permanent setups.

- `--help` ends up providing all the doc needed to configure the tool.

- It's less magical than env vars and doesn't leak into subprocesses by mistake.

One pragmatic step further is to expand env vars using the shell `${VAR}` syntax for slightly more flexibility.
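
Not the commenter's actual tool, but a rough sketch of the pattern in Python; the --args-file flag and the ${VAR} expansion are assumptions for illustration:

    import argparse
    import os
    import sys

    def splice_args_file(argv):
        # Replace '--args-file <path|->' with the file's contents, one argument
        # per line, expanding ${VAR} references from the environment.
        if '--args-file' not in argv:
            return argv
        i = argv.index('--args-file')
        source = sys.stdin if argv[i + 1] == '-' else open(argv[i + 1])
        extra = [os.path.expandvars(line.strip()) for line in source if line.strip()]
        return argv[:i] + extra + argv[i + 2:]

    parser = argparse.ArgumentParser()
    parser.add_argument('--port', type=int, default=8080)
    parser.add_argument('--log-dir', default='/tmp')
    args = parser.parse_args(splice_args_file(sys.argv[1:]))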


I wish there was wider adoption of configuration services. Something like the Envoy xDS protocol, but for general configurations.

Puppet & Co did it to some extent, but for the whole server. On boot, the server asks a centralized service for a configuration, providing just its identity and some basic attributes (datacenter, rack, stage), and receives its full configuration. Implementing this pattern for a service itself would make it easier to configure swarms of services.

I experimented with using Open Policy Agent, where it dispatches exact configuration based on client identity and quite liked the result. Only downside is that it requires polling.


It indeed seems like there are a lot of fundamental misconceptions in this article:

> For comparison using JSON configuration files this entire class os problems would not exist.

This is the trade-off between structure and embedding. Writing '{"ARG": "-Dfoo=\"bar bar\" -Dbaz"}' would satisfy the author's need for JSON but not really make any difference.

The initial example is also wrong. There certainly are examples where one would use variables defined in an outer scope.

  f v = f' 0
    where
      f' n = v n


Good point, not the best arguments. But somebody needs to say this.

Configuration in environment variables is suitable for short options for commands that need to be inherited by sub-commands. Great for things like LESS and http_proxy.

Suddenly people start shoving all kinds of configuration for a specific instance of a piece of software into environment variables. Often with an argument about how otherwise it's not "twelve factor". That's not great. I mean, it's nice and all, but it's quite literally a blog post from a guy on the Internet, not an argument in itself.

Arguments against putting all configuration for a software instance in environment variables are that it's not suitable for non-ASCII and multiline data for a multitude of practical reasons, the storage space is limited, the actual size will vary between operating systems and overflowing it will not be obvious, and the fact that child processes will inherit this data. If there are keys and other secrets involved, child processes will receive a copy of them.

In comparison, storing configuration in a file will have a much more well defined format, it can be written and copied just like any other piece of data, and the standard tools will control access to it. Things like AppArmor can limit access further to the single process.

Configuration files have been used forever for a reason; it's a reasonable default choice. Environment variables should be used where they are suitable: for interactive tools and for globally shared settings.


Never trust advice expressed in absolutes.


Well, infer that it's not absolute but rather ironically depends on the kind of environment you're developing software for. <:)


What I mean is that when advice is expressed in absolutes it tends to be a radical view of the problem.

Radicals tend to dismiss things that don't agree with their world view, and I am always wary of this. And you should be too. I don't mean radicals can't be right. What I mean is you should be cautious about it.


I feel like this is one of those riddles with two doors, two guards, one always tells the truth, the other always lies.


Environment variables, when used correctly, tend to work just about everywhere. That's why they're in use. It's the same reason why programs still install to `/usr/bin`, even though we're all very much aware that `/usr` is not "user directories" and the whole idea of "userspace" vs "kernel space" is irrelevant in 99% of *nix installs these days (including containers).


Configuration should be parseable. Unix philosophy has been to use configuration files for anything much more complex than flags, which seems reasonable to me. Don't try to force structured configuration into arrays of strings.

Traditionally, I see flags as being very application-specific and environment variables being very generic. Environment variables control the behavior of shared libraries and the interaction with the operating system. That separation frees applications from worrying about name collisions between application and library configuration.

The tradeoff Unix made between unstructured configuration (e.g. main(void *arg)) and fully structured configuration was probably wise. Arrays of strings with quoting rules are pretty flexible and human readable. It doesn't give the full flexibility of the extreme approaches but has aged well.

For example, anyone who is annoyed by environment variables is free to wrap every program they care about in a shell script that sets the environment from the flags they want. Anyone who hates flags can pass environment variables to their version of the script with a bunch of --flag=${MY_PRECIOUS} inside.


Okay, so, I don't think the author cares about the situation where your entire system is, like, 1) set env var 2) launch long-running network service 3) never spawn another process inheriting the environment. That's the trivial, happy case. This post is probably more useful if you look at it from the perspective of a system invoking many different processes that come and go, somewhat recursively, likely integrating many independently-developed codebases.

With non-trivial use of environment variables, you're just back to arguing about dynamically-scoped variables vs. lexically-scoped variables (comparing environment variables to lexically-scoped global variables seems beside the point). There are uses for dynamic scope and it keeps getting reinvented, but we know it comes with gotchas and typically prefer lexically-scoped variables.

eg., environment variables are handy because you can pass them to your grandchildren without the direct children needing to know about them. Environment variables are hazardous because you can inadvertently pass environment variables to your children without knowing you inherited them from your parent.

Environment variables can save you from having to teach your program to pass on the correct configuration. Environment variables can damn you when your children rely on them and hence omit facilities to propagate configuration in some nuanced manner. etc.

I'm taken aback by the rancor in our comments. It shouldn't come as a surprise that people developing different kinds of software arrive at different best practices. I'm enjoying this kind of post much more when resolving the cognitive dissonance by trying to understand where the other side is coming from and how the experiences that shaped our respective aesthetic intuition differ.


Or you could store your config as an environment variable in JSON, like Cloud Foundry does. I mean, JSON is really just one big string after all...


Now every program and script needs a full JSON parser and to all agree where this JSON is located in the system.


> Environment variables is exactly this: mutable global state.

Not really; environment variables can't be externally changed once they're passed in. If it were true mutable global state, some external process could change their value while a program is running. Or the program could change the value of the environment variable in the shell that launched it. Either of those would be horrifying.

In other words, environment variables could be worse. They could also be better. My main objection to environment variables is that they're not a discoverable interface. Ideally, it should be possible to find out what environment variables a program accepts without running the program (or grepping the source code for "getenv"), and in most cases it should be possible to enumerate the valid values.

Environment variables are just one of many things about the typical POSIX / ANSI C interface that have been around so long people don't even question them anymore.


> For comparison using JSON configuration files this entire class os problems would not exist. Every application would read the data in the same way

The issue they mentioned earlier remains, as it's not specific to env vars. What if you don't supply a config file?

> The environment is now different. What should the program do? Use the old configuration that had the env var set or the new one where it is not set? Error out? Try to silently merge the different options into one? Something else?

I'm really not a big fan of using env vars for configuration but there are cases where it's a good choice, for example running docker compose and passing a .env file that becomes the configuration for the container.

I agree that it's not a very good way to configure an application in a "shared" environment. For instance, I'm not a fan of curl reading my HTTPS_PROXY env var and implicitly using that value. That should be controlled exclusively by a cli switch imho.


Moving from tokenising files to env vars was one of the best choices I ever made. Life is sooooo much easier using env vars.


> environment variables are mutable global state, therefore bad

Wrong. They are immutable global state. There is no way of updating a process environment set after it's started. Apart from corner cases (execve), env vars are immutable, therefore act like constants. And constants aren't bad, are they?


The most common problem that I've encountered with using environment variables for configuration is that developers often add them ad-hoc, in the module where the configuration is needed, making it impossible to have a clear idea of all the configurable options in a program.

The solution is not to get rid of env vars, but to have a mechanism that centralises the configuration in one place; this way all the options are easy to locate, and they can be defined in multiple ways.

Shameless plug, FWIW: figga is my humble contribution to this area for Python: https://github.com/berislavlopac/figga


The author says:

    // to call it you'd do
    first_argument = 1;
    second_argument = 2;
    int three = add_numbers();

This is, I trust you all agree, terrible. This approach is plain wrong.

I agree, but I don't think everyone agrees. Just have a look at the Apple APIs: their configuration options are all objects you have to spend time configuring, then pass as an argument. I don't know why everything with Apple development beats around the bush. It's almost too object oriented.

To be clear, Apple is more like this:

    let argument = Argument()
    argument.first = 1
    argument.second = 2
    let addRequest = AddRequest()
    adder.add_numbers(with: arguments)
    let mathRequest = MathRequest()
    mathRequest.perform([addRequest])
    if let results = addRequest.results {
        // Do something with your results.
    }

And yes, Apple development makes me feel smart when I solve something that should've been documented.


I think you're objecting to different things than the author.

The author is complaining that there's no link or information that first_argument or second_argument are used by add_numbers.

In comparison, in the Apple API example you've given, addRequest is explicitly provided to perform.

What you appear to be objecting to is setting up potentially complicated config objects to pass rather than passing simpler arguments.


I never understood why an environment variable makes sense. A type-safe configuration in something like Dhall makes much more sense. For secrets, environment variables are also a terrible idea. An encrypted vault is much better.


You can have both: write your code in a way that you pass its configuration. This makes code more testable and reusable.

Now in the outermost layer where you start your program you get the environment variables and pass them.


I don't mind environment variables for config, but it really bugs me when libraries read them. I'd much prefer to front-load them into my app's entrypoint where I can see and control them.


After years of struggling with config files in heterogeneous production environments, I'll argue the opposite: environment variables are the BEST option to manage your configuration.


I don't have a strong opinion on this, but a comment in the article led me to this discussion, which is just wild: https://github.com/ninja-build/ninja/issues/1482 I get a sense that the maintainers hold a similar sentiment as the author of this piece, but it's never expressed, yet the frustration feels real.


Interesting: I went into reading this thinking in the context of programs. However, reading the replies here has made me realize it wasn't constrained to just that. For programs, there are better alternatives, especially if you're using a deployment pipeline that has things like key-value stores. In general, environment variables are great, especially for things that have access, practically, to not much else, like shells.


Relatedly https://boakye.yiadom.org/bikeshed/env/

In short, I'm in full agreement with the author on not implicitly using magic variables/values in our programs. Environment variables are fine so long as they're not read from within the depths of a function.


OpenMP[0] wants to have a chat with you about the usage of environment variables for runtime configuration of applications and dynamic adaptation of software to the environment it runs in.

[0]: https://www.openmp.org/spec-html/5.0/openmpch6.html


If my program needs more than a couple arguments, I prefer a YAML configuration file. I usually try to enable environment variable substitution in that file. This makes configuration more natural when/if the program is run in a container (you don't need to volume-mount a file); and it gives the user flexibility in how they configure the program.

To each their own, I suppose.


This makes sense. You can use a config file and a CLI param to point to a specific config file. However, env vars have worked very well for me, so there's this difference between theory and practice. I haven't found the reason, but ultimately the model that the AWS CLI uses is very ergonomic (default config < env var < CLI pointer to specific config).
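
A sketch of that layering in Python, with hypothetical names (the AWS CLI's real variable names differ); each layer only overrides keys it actually sets:

    import os

    DEFAULTS = {'region': 'us-east-1', 'output': 'json'}

    def load_config(cli_overrides):
        config = dict(DEFAULTS)                                  # lowest precedence: defaults
        for key in config:                                       # next: environment variables
            env_value = os.environ.get(f'MYTOOL_{key.upper()}')
            if env_value is not None:
                config[key] = env_value
        config.update(cli_overrides)                             # highest precedence: CLI flags
        return config

    # e.g. MYTOOL_REGION=eu-west-1 mytool --output text
    print(load_config({'output': 'text'}))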


I think there are some valid arguments to be made against the use of envvars for configuration, but none of them were in this article.


I have only very rarely used environment variables in my own software over the last 20 years. The argument for their use that I have heard the most frequently has been "you would have to be an idiot to not use them" which has thus far failed to convince me. That, and its little brother "you must never have pushed anything to production" seem to find some echo in this thread.

From what I could gather, the actual reasons that Other People use environment variables seem to be:

- Other People use some tool for a reason unrelated to environment variables, and that tool happens to make environment variables extremely easy or convenient. A more pessimistic alternative: that tool makes everything inconvenient except for environment variables. Whatever that tool may be, I suspect that I am not using it.

- Other People write programs that are invoked (I write programs that run continuously), and environment variables improve the ease with which programs can be invoked, which is a concept I never have to deal with.

My programs load their configuration from files at startup; configuration files are committed to git and included in the build; secrets live in a remote key vault, with a local cache in case of unavailability. Maybe in the future, I will discover something that environment variables greatly improve about this. I don't know.


His example:

  $ SOME_ENVVAR=... some_command <args>
Is wrong. SOME_ENVVAR is set only for the invocation some_command; at the next prompt, the value of SOME_ENVVAR is what it was before the invocation. To persist the setting of the value you need to export it:

  $ export SOME_ENVVAR=... some_command <args>


> Environment variables is exactly this: mutable global state.

Wait, a large point of configuration is exactly that they are read-only.


If you're a heathen like me, you sometimes store JSON configuration in environment variables https://mitchum.blog/how-to-store-json-in-an-environment-var...
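
The whole trick, sketched in Python with a made-up variable name:

    import json
    import os

    os.environ['APP_CONFIG'] = '{"port": 8080, "features": ["a", "b"]}'

    config = json.loads(os.environ['APP_CONFIG'])
    print(config['features'])   # structured data, arrays and all, out of one env var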


Storing the name of an "environment" in an environment variable and then using that to pick the appropriate configuration file is what .Net Core does for web apps and that seems to work fairly well.

Then, of course, you have the fun of Azure Application Configuration and KeyVaults - which are fine once set up and your app is happy with them.


there's an astonishingly silly GitHub issue thread linked in the comments of the post: https://github.com/ninja-build/ninja/issues/1482


That was a soap opera. I particularly liked the subplot where people started flaming each other about different versions of C and the use of goto.


That is... quite the issue thread. So many people to be annoyed at in there!


I think if machines are treated like single-purpose cattle then environment variables are fine.

But for a single system with lots of different applications and functionality going on, you kind of run into the same problem as you have with mutable global variables.


    int first_argument;
    int second_argument;

    int add_numbers(void) { return first_argument + second_argument; }

    // to call it you'd do
    first_argument = 1;
    second_argument = 2;
    int three = add_numbers();

> This is, I trust you all agree, terrible.

This isn’t really a case against environment variables per say but against global config variables generally. Whether it comes from the environment or a config file that state will be global. So the example of adding globally defined integers is obviously silly, but if you change those variables to a username and password strings then change the function to db_connect it makes sense.


Is there any good guide on how to use environment variables as config properly, especially on managing and deploying them to servers?

I am having a hard time figuring these out. So I rarely use environment variables.


Keep it in one place and have an easy way to disable setting configuration through ENV variables altogether. Don't let it get out of hand by sprinkling it everywhere, that's all.


> An environment variable can only contain a single null-terminated stream of bytes. This is very limiting.

The same is true of command-lines and files. Computers can only use bytes.


A file's contents are not effectively truncated just because a NUL appears halfway through it, unlike the environment or arguments.


"This is the way we have always done it so it must be correct!" != "This is the way we have always done and it works perfect for everybody!"

I want to use env vars because:

* I want to reuse builds and my systems have different behavior on different environments.

* I want to override the default config sometimes

* I don't want to expose sensitive configuration, and yet have it explicitly configured in my project configuration

The post is very opinionated and lacking any valuable point, IMO.


Great post! A few months ago I wrote this about my experience with this (mostly based on my experience writing Go): https://henvic.dev/posts/env/

IMHO almost always (with exceptions) pointing to a secret/value in a file is better. For all other things, flags passed on arguments are better.


The article is not against env vars per se as the title suggests. Instead it makes a case against undocumented features and improper use of env vars. It also complains about env vars being plain strings but fails to recognize config files are just plain strings without the appropriate parser. What stops anyone from putting JSON arrays in env vars?


sure env vars are overused. some values simply do not belong in the environment, for example - credentials. however, i haven't seen a more convenient method of injecting configuration dynamically without some error-prone file watching mechanics


Why don't credentials belong in the environment? They're definitely not visible to other users there, as opposed to on the command line where they definitely are, or in a file where they can be if you don't set the permissions correctly.


Because with credentials in the environment, we now have to distinguish between private and public environment vars so we don't accidentally log something we shouldn't, or otherwise expose security-sensitive info.
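
And that distinction leaks into every bit of tooling; a sketch of what it tends to look like (the list of "private" names is made up):

    import os

    PRIVATE = {"DB_PASSWORD", "API_TOKEN", "AWS_SECRET_ACCESS_KEY"}

    def env_for_logging():
        # Every place that dumps the environment now has to know the secret list.
        return {k: ("<redacted>" if k in PRIVATE else v) for k, v in os.environ.items()}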


I think you're trying to say: don't use environment variables as a database.


Somebody never heard of CGI?


Another nice one is chdir() and how it behaves between threads.


When you realize threading was added late in the game on Unix, this is hardly surprising. Unix is process-oriented.


But I keep putting more and more env vars into my programs. Enough command line switches already, but for debugging I always fall back to undocumented getenv's, rather than adding and removing silly -Dhexmasks.
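
E.g. the kind of throwaway switch I mean (a sketch; the variable name is invented):

    import os

    # Undocumented escape hatch: MYTOOL_DEBUG_MASK=0x4 turns on extra tracing
    # without growing the official command-line surface.
    DEBUG_MASK = int(os.environ.get("MYTOOL_DEBUG_MASK", "0"), 0)

    if DEBUG_MASK & 0x4:
        print("tracing enabled")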


Calling env vars 'global mutable variables' is pretty misleading. The example C code is a terrible strawman, and makes it appear that the author is simply unaware of some core compsci/programming concepts.

Environment variables behave much more like dynamic variables (AKA dynamically scoped/bound variables), rather than global variables https://en.wikipedia.org/wiki/Scope_(computer_science)#Dynam...

In particular, we can override env vars for a particular call, without affecting anything else. For example, consider the following script:

    echo "BEFORE $FOO"
    FOO=bar printFoo
    printFoo
    echo "AFTER $FOO"
If we run this script with 'FOO=foo', and assuming that the program 'printFoo' simply prints the 'FOO' env var, we will get:

    BEFORE foo
    bar
    foo
    AFTER foo
The variable 'FOO' wasn't mutated (since the value remains the same across lines 0, 2 and 3); it's also not globally scoped (since line 1 saw a different value). Rather, each process is run with its own environment, whose initial contents are inherited from the scope that invokes the process, plus extras/overrides like 'FOO=bar' above.

Env vars are mutable, and mutating an env var acts differently to mutating a dynamic variable: dynamic scope looks up variables on the call stack, so mutations high up the stack will be visible after returning. Instead, env vars are copied from parent to child at each new scope (process), so mutations are only visible to that process and any subsequent subprocesses. In fact, that makes env vars even less 'globalish' and 'mutableish' than ordinary dynamic variables!

For example: if 'printFoo' finished by mutating 'FOO' to equal 'baz', it wouldn't affect the above script at all. Even line 2, which "inherits" the script's FOO, would only be mutating its own copy of 'FOO', which doesn't affect the script's variable.
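
The same copy-on-spawn behaviour, sketched outside the shell (Python here, purely for illustration):

    import os
    import subprocess
    import sys

    os.environ["FOO"] = "foo"
    # The child gets a copy of the environment; its mutation stays local to it.
    subprocess.run([sys.executable, "-c",
                    "import os; os.environ['FOO'] = 'baz'; print('child sees', os.environ['FOO'])"])
    print("parent still sees", os.environ["FOO"])  # -> foo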

In any case, I highly recommend avoiding mutating env vars, for the same reason I avoid mutating any variables, regardless of language, unless there's a specific reason to. If we treat env vars in an immutable way, then they act exactly like dynamic variables.

As an example of dynamic variables, consider the following Lisp code:

    ;; assumes FOO was declared special (e.g. with DEFVAR), so LET rebinds it dynamically
    (write-line (concatenate 'string "BEFORE " FOO))
    (let ((FOO "bar"))
      (printFoo))
    (printFoo)
    (write-line (concatenate 'string "AFTER " FOO))
The '(let ((FOO "bar")) ...)' construct acts like the 'FOO=bar ...' of the script.

Not only do I find env vars very useful for config, I also find dynamic scope is very under-utilised in "proper" (non-shell) languages. For example, dynamic scope is a great way to do dependency injection: rather than passing around extra arguments, or adding a bunch of private fields to objects, etc. we can just reference the dependency with a dynamic variable, and open a new scope whenever we want to set its value (at an application's entry point, or in a test, etc.)
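
For instance, in a language without real dynamic variables you can approximate the same thing; a sketch with Python's contextvars (names invented):

    from contextvars import ContextVar, copy_context

    # The "dynamic variable": a default for normal runs, overridable per scope.
    db = ContextVar("db", default="postgresql://prod/app")

    def handler():
        return "querying " + db.get()

    # A test opens a new "scope" with the dependency swapped out.
    ctx = copy_context()
    def with_fake_db():
        db.set("sqlite://:memory:")
        return handler()

    print(ctx.run(with_fake_db))  # uses the fake
    print(handler())              # back to the default outside that scope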


appdirs


This blog post is so wrong and out-of-touch that it boggles the mind how it popped up on HN. The blogger's strawman for criticizing the use of env variables is so mindless and outlandish that it casts doubt on the author's experience and know-how.

Meanwhile I'll just say the following:

* Any config input, whether passed through env variables, config files, command line arguments, external services, or telepathy, is handled by clients. Sure, anyone is free to call getenv() directly from whatever corner of the project, but to handle any input properly you need dedicated clients that cover common use cases such as input validation, logging, sane defaults, and layering config settings.


I lol-ed.


Damn, this guy has never worked in a production environment before.


I have, and I think he's right.

I don't think picking apart the (assumed) experience of the author in this way is conducive to a high quality discussion.


Meh. Most configuration libraries I've used make all of these issues into non-issues. It's very useful to be able to set configuration with environment variables, even if you're loading from a file.



