
The moment you need control flow to define your resources, I'd argue that you're verging away from the realm of declarative infrastructure.

I'm using Terraform to manage 10^4 machines in combination with sane CI/CD, Bash/JQ (for dealing with Terraform outputs), Packer and Ansible. Every time I see somebody reaching for a full programming language to define their infrastructure, they seem to be doing too much with one tool.

Terraform should merely provision things and in that role I find it fine as is. Preferred, even.




It's not flow control so much as sane, expression-based generation of resources. Terraform has been evolving toward the dynamic with features like for_each, but these features have awful ergonomics compared to something like list comprehensions. Similarly, sometimes you want to reuse some pattern, but the unit of reuse in Terraform is the module, which involves a lot of ceremony, so you don't reach for it as often as you might if the unit of reuse were a simple function definition.

I don't especially care whether Terraform stays a simple static language for generating resources and the DRY-ness comes from an exterior Python/etc script that generates the Terraform files, or whether the requisite dynamism is built into the Terraform language itself -- but make no mistake, the dynamism is absolutely essential for maintainable Terraform.


I really dislike the idea of declarative infrastructure. It's literally a program that is designed to do one thing, but will change a million things in order to do that one thing. It's Configuration Management for Infrastructure. Yet so many people have this idea that it's something else, like it's supposedly simpler or more effective.

Saying I want an S3 bucket named "foo" is the same as

  aws s3api list-buckets | grep '"foo"' || aws s3api create-bucket --bucket "foo"
Did I need a big fat declarative infrastructure system to make that? No. But people want more complexity and features, and they want to make it look simple. So they write big fat applications and libraries to do that. The idea that there's some inherently superior difference of "declarative programming" over "regular programming" is giving people the idea that wrapping everything up in interfaces somehow removes the complexity, or somehow ends up in a better program.


This example is really simple -- it gets more complicated when you want to check things that don't serialize perfectly to strings you can easily grep for.

Once you start writing complex scripts you have a choice -- you either do it imperatively, or declaratively. Eventually you realize it doesn't make sense to just run imperative commands when you can't guarantee that the other end is idempotent, so you'd arrive at:

- (optionally) take a lock on performing changes

- Check existing state

- Perform your changes based on existing state

- (optionally) release your change lock

And voilà, we're at complexity. I'd argue that this complexity is essential, not accidental, given the goal of making an easy-to-use system that ensures state.
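
Even the simple bucket example grows into something like this (a sketch only; flock for the optional lock, names illustrative):

    #!/usr/bin/env bash
    set -euo pipefail

    exec 9>/tmp/infra.lock && flock 9            # (optionally) take a change lock
    if ! aws s3api head-bucket --bucket foo 2>/dev/null; then   # check existing state
      aws s3api create-bucket --bucket foo       # change based on observed state
    fi
    flock -u 9                                   # (optionally) release the lock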


> Once you start writing complex scripts you have a choice -- you either do it imperatively, or declaratively.

I don't think declarative programming exists. I think it's just a regular old program with a poorly defined interface. Moreover, I think the claims of idempotence are overblown to the point of near falsehood.

Declarative Infrastructure is really just Configuration Management applied to cloud infrastructure rather than operating system software. Neither has really solved anything, other than turning the management of complexity into a Sisyphean task. Forever pushing drifting state back up the hill.

Compare this to Immutable Infrastructure, where state never drifts. One never "fixes" a container once deployed, or a package once built and installed. One merely rolls back or upgrades. Any uncertainty is resolved in the build and test process, and in providing both all the dependencies and the execution environment.

I think eventually people will wise up to the fact that Terraform is just Puppet for infrastructure. I think the real fix is to make the infrastructure less like an operating system and more like versioned packages. Install everything all at once. If anything changes, reinstall everything. Never allow state change.
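
In shell terms the workflow looks roughly like this (a sketch assuming a single-host Docker deployment; image tags and container names are made up):

    # build an immutable artifact; never patch the running one
    docker build -t app:v2 .
    # start the replacement alongside the old version
    docker run -d --name app-v2 app:v2
    # ...health-check app-v2, cut traffic over, then discard v1...
    docker stop app-v1 && docker rm app-v1

Rolling back means starting app:v1 again, never mutating anything in place.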


If you’re careful you can do blue/green failover to replace some resources, but datastores need to be updated in place.


Sounds like we need to reinvent data stores!

SQL is due for replacement. The combination of schema and data in one constantly mutating hodgepodge with no atomic immutable versioning or rollback is absolutely ancient. Migrations are an okay hack but definitely not good enough.

ZFS and LVM prove filesystems can do snapshots and roll back to a prior version of filesystem history without a lot of pain, so clearly we just need more work here to make it an everyday thing. Versioning should be the default, and probably also an infinite transaction log, seeing as capacity and performance are ridiculous now.
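
For example, with ZFS (dataset name made up):

    zfs snapshot tank/db@before-migration    # atomic, constant-time snapshot
    # ...run the risky migration; if it goes wrong:
    zfs rollback tank/db@before-migration    # restore the exact prior state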

And couldn't we lock writes, revert a perpetual write journal/transaction log to some previous version, and then continue a new write history tree? If you run out of space, overwrite the old history. If you don't run out of space, allow reverting back.

And allow bulk atomic updates by specifying a method to write files that aren't 'live' until you perform some ioctl, and then atomically expose them and receive the new history version. Then you could do immutable version-controlled storage on a filesystem, right?

Blob/object stores should be much simpler to do the same with. Just an API rather than ioctl.

In this way, replacing a data store immutably will just be replacing a reference to a storage version, the same as using snapshots, but built into the filesystem/API.


Hm, isn’t there still nondeterminism from a path dependency, because reinstalling a datastore that has arbitrary history isn’t exactly equivalent to creating a datastore with none?


> I don't think declarative programming exists. I think it's just a regular old program with a poorly defined interface. Moreover, I think the claims of idempotence are overblown to the point of near falsehood.

I think this really depends on how you define the term "declarative programming" -- pinning down a singular meaning and a singular interpretation is really hard. If we think of it as a spectrum, there's a clear difference between Ansible and Terraform, just as there is between Python and Prolog. That's "declarative" enough for me.

Idempotence is also really tricky and hard -- I'm not surprised most large codebases can't handle it, but getting close is definitely worth something.

> Declarative Infrastructure is really just Configuration Management applied to cloud infrastructure rather than operating system software. Neither has really solved anything, other than turning the management of complexity into a Sisyphean task. Forever pushing drifting state back up the hill.

While I agree on declarative infrastructure being configuration management applied to cloud infra (especially in the literal sense), I would argue that they have solved things. In the 90% case they're just what the doctor ordered compared to writing every Ansible script yourself (or letting someone on Ansible Galaxy give it to you) -- and Ansible actually supports provisioning! The thing with this declarative infrastructure push is that it's encouraged the companies themselves to maintain providers (with or without the help of zealous open source committers), so now someone else is writing your Ansible script and it has a much better chance of staying up to date.

> Compare this to Immutable Infrastructure, where state never drifts. One never "fixes" a container once deployed, or a package once built and installed. One merely rolls back or upgrades. Any uncertainty is resolved in the build and test process, and in providing both all the dependencies and the execution environment.

People are often using these two concepts in tandem -- the benefits of immutable infrastructure are well known, and I'd argue that declarative infrastructure tools make this easier to pull off not harder (again, because you don't have to write/maintain the script that puts your deb/rpm/vm image/whatever on the right cloud-thing).

> I think eventually people will wise up to the fact that Terraform is just Puppet for infrastructure. I think the real fix is to make the infrastructure less like an operating system and more like versioned packages. Install everything all at once. If anything changes, reinstall everything. Never allow state change.

Agreed, but I'm not sure this is very practical, and there's a lot of value in going part of the way. There is a lot of complexity hidden in "reinstall everything" and "never allow state change", and getting that going without downtime requires the cooperation of the systems involved most of the time; you'll never get away from the fact that some efficiency is lost.

But again, we were talking about the scripts you'll have to write -- in a world that is not yet ready for fully immutable infrastructure, it's just a question of how you write the scripts, not whether an option exists that will prevent you from writing them altogether (because there isn't, and most things are not fully immutable-ready yet).


> there's a clear difference between Ansible and Terraform

The only difference I can see is that Terraform attempts more of an estimation of what might happen when you apply. Otherwise they're the same.

Terraform has multiple layers of unnecessary complexity which were added with good intentions (the belief that you could "plan" changes before applying them) but don't actually work in practice. Your state file never reflects the actual state, so it's pretty much meaningless. The plan step is (in theory) supposed to tell you what will happen before you hit apply. But actually knowing it beforehand is impossible.

Part of that is the fault of providers that don't do the same validation as the actual APIs you're calling. But the other part is the fact that the system is mutable; it's always changing, so you can never know what will happen until you pull the trigger. The only way to say "only apply these changes if they will actually work" is to move the logic into the system itself, turning changes into transactions (à la SQL).
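
The closest you can get today is applying a saved plan, which at least refuses to run if the state has moved underneath you (a sketch):

    terraform plan -out=tfplan    # record the intended changes
    terraform apply tfplan        # errors out if state changed since the plan

But even that can't stop the remote API itself from rejecting or reinterpreting the change at apply time.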

Honestly, the only reason I use Terraform at all is because writing a bunch of scripts is not scalable. With large teams, you have to use some kind of uniform library/tooling to manage changes. Terraform is currently the best "free" option for that, but I don't find Ansible any more or less reliable, it's just more annoying to use. I definitely don't use them for any "declarative" approach they may have. And in fact, for regular app deployments, I actively do not use Terraform/Ansible at all, and instead write deployment scripts that can manage my particular deployment model requirements. I intentionally abandon the "declarative" model because it's so uncertain (and unwieldy).

> The thing with this declarative infrastructure push is that it's encouraged the companies themselves to maintain providers (with or without the help of zealous open source committers), so now someone else is writing your Ansible script and it has a much better chance of staying up to date.

I agree with you here, it's very good that companies can invest in supporting a provider so people can benefit from common solutions. I'm not sure that is specific to declarative infrastructure as much as just being more proactive about supporting their users using their services, though. For example, NewRelic didn't have a Terraform provider until one of their customers wrote one, and eventually they took it over. It's still not great (I have to supplement a lot of missing features with custom scripts calling their APIs directly), but it's better than nothing.


Infrastructure should be defined in an easily digestible, human-readable format.

Your manifests serve two purposes: they define infrastructure and they self-document.

While you can achieve the same infrastructure automation with shell scripts, they’re rarely written well enough to easily understand, introducing operational risk when handed off to other people or teams.


Documentation needs to express the intent of the author and how they arrived at a solution, and more importantly why they arrived at a solution. As someone who's had to clean up "self documented" code I can say unequivocally it will be a disaster. A decade from now we will be untangling thousands of lines of some ancient Python library to understand the intent of infrastructure that could have otherwise been properly documented in 5 minutes.


Yes, but AWS CLI commands change over time, and the CLI doesn't have a native way of pinning which version you use. Also, you have to maintain that knowledge for however many things you have to do across however many providers.

The point of Terraform isn't to add complexity, it's to have a general way of interacting with a vast number of APIs that's effectively the same and to abstract away the tribal knowledge of knowing how each individual API works.

On the same provider version, you can generally expect Terraform to work the same over time (okay, this is less true for, say, the Google provider...), whereas the CLI keeps evolving.

It's still helpful to understand the providers and their CLIs, but Terraform is a substantial force multiplier because of how generic it is across the absurdly long list of APIs that it talks to. That is where its value lies.


But it's not generic. I have to track the provider version, the Terraform version, my module version, and any sub-module versions, long-term. Each internal team has to jump through hoop after hoop just to run terraform apply reliably.

I've never yet had to rewrite a shell script that used a new version of AWS CLI. It's very possible that that's only because I've not been using it enough. But even that would be just one level of complexity to manage, rather than four.

And in fact, even within a single provider, interfaces aren't generic at all.

You have to write every single resource of every single provider to be specific to its definition. It would be the same amount of work if you were writing a shell script with curl to plug into each API call. I know, because it was actually faster for me to write a Bash implementation of NewRelic's APIs than to figure out its ridiculous Terraform provider and resources with no documentation.
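
For the curious, that bash implementation is mostly thin curl wrappers of roughly this shape (a sketch; modeled on NewRelic's REST v2 alert-policies endpoint, with the payload and key handling illustrative only):

    curl -s -X POST "https://api.newrelic.com/v2/alerts_policies.json" \
      -H "X-Api-Key: ${NEW_RELIC_API_KEY}" \
      -H "Content-Type: application/json" \
      -d '{"policy": {"name": "my-policy", "incident_preference": "PER_POLICY"}}'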

The only benefit of Terraform is that I don't have to write the API implementation [for most providers].


> The moment you need control flow to define your resources, I'd argue that you're verging away from the realm of declarative infrastructure.

Declarative infrastructure shouldn't be pursued for its own sake -- what I want is efficient, simple-to-manage infrastructure automation. The declarative nature is awesome, but once you start plumbing variables and complexity from one static script to another, the cognitive load of keeping it all in line is better managed with a programming language, in my opinion; you're just choosing bash/jq/awk/etc instead of a different language.

I think "the way the declarations are made must be static files" is dogmatic, or at least limiting, for me. Yes, it is absolutely the simplest way to view what's present, but the problem is that when someone goes in to change any of this, they will be dealing with your bolted-together complexity (even if it's not very complex).

> I'm using Terraform to manage 10^4 machines in combination with sane CI/CD, Bash/JQ (for dealing with Terraform outputs), Packer and Ansible. Every time I see somebody reaching for a full programming language to define their infrastructure, they seem to be doing too much with one tool.

> Terraform should merely provision things and in that role I find it fine as is. Preferred, even.

I can't argue with the efficiency and efficacy of your setup, but I don't think much of this has to do with what we were discussing -- Pulumi does not seek to do the jobs of those other tools -- it's not going to build your VM images or do provisioning (unless you use it that way, as with Terraform provisioners[0]).

Here's a concrete example of a benefit I got from using Pulumi over Terraform recently, in some code working with SES:

    import * as fs from "fs";
    import * as path from "path";

    // ... more imports and other lines

    // Email Access key
    const emailAccessKey = new aws.iam.AccessKey(
      `${stack}-ses-access-key`,
      {user: emailUser.name}
    );

    export const emailUserSMTPPassword = emailAccessKey.sesSmtpPasswordV4;
    export const emailUserSecret = emailAccessKey.encryptedSecret;

    // Write the smtp username and password out to a local secret file
    const apiSecretsDir = path.join(__dirname, "secrets", "api", stack);
    const smtpUsernameFilePath = path.resolve(path.join(apiSecretsDir, "SES_USERNAME.secret"));
    const smtpPasswordFilePath = path.resolve(path.join(apiSecretsDir, "SES_PASSWORD.secret"));

    emailAccessKey.sesSmtpPasswordV4.apply(password => {
      console.log(`Writing SES SMTP username to [${smtpUsernameFilePath}]`);
      fs.writeFileSync(smtpUsernameFilePath, emailUsername);

      console.log(`Writing SES SMTP password to [${smtpPasswordFilePath}]`);
      fs.writeFileSync(smtpPasswordFilePath, password);
    });
I wanted to write information out to a file... So I just did, and that was it. No need to reach for the stack output later and pipe it anywhere -- any time Pulumi runs it will update that variable if/when it changes, and the next tool (which requires the file at that path to be present) will continue on without knowing a thing.

I can't say that this is perfect Pulumi code (ex. I could have defined a custom Resource to do this for me), but I have saved myself having to do the plumbing with bash scripts and terraform output awk-ing, and the information goes just where I want it (NOTE: the secrets folder is encrypted with git-crypt[1]). When someone comes to this file (ses.ts), they're going to be able to easily trace where these values were generated -- similar with bash scripts, but now they don't have to be a bash/awk/jq master to manipulate information. There are definitely some gotchas to using Pulumi (like the `.apply` there), but in the end, I'd prefer to make changes like this in a consistent language I like (TypeScript).

My toolkit looks very similar to yours, except I basically only use make + kubectl + pulumi + ansible (rarely, because of the kind of servers I rent).

[0]: https://www.terraform.io/docs/provisioners/

[1]: https://www.agwa.name/projects/git-crypt/


You can just write information out to files in Terraform with no stress.

In Terraform resources, this is what that looks like:

    resource "aws_iam_user" "email" {
      name = "email"
    }

    resource "aws_iam_access_key" "email" {
      user = aws_iam_user.email.name
    }

    resource "local_file" "smtp_password" {
      content  = aws_iam_access_key.email.ses_smtp_password_v4
      filename = "SES_PASSWORD.secret"
    }
So what's the plumbing that you would have to do? Under the hood, Pulumi is using the Terraform providers...

(I left out username, because I don't see where you're setting emailUsername)

In my pipelines I don't bother writing out to files things that are in terraform state. I just create an output for that state (potentially set to sensitive) and then use that output in my CI/CD. Remote state stays encrypted and without wide access and I don't have to worry about secrets being in files anywhere.

That's where the bash scripts do things with outputs. It could be Python or whatever, it doesn't matter really. But with bash I can easily just set variables to `terraform output -json | jq <select output &/|| do stuff>`.
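
Concretely, something like this (output name made up):

    DB_HOST=$(terraform output -json | jq -r '.db_host.value')
    echo "deploying against ${DB_HOST}"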

Mainly all I do is write Terraform outputs to Vault (I have simple bash automation to do all of this) and then I can use the Vault secrets in other CI/CD pipelines.
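
E.g., along these lines (path and output name made up; the trailing `-` tells vault to read the value from stdin):

    terraform output -raw db_password \
      | vault kv put secret/myapp/db password=-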


> You can just write information out to files in Terraform with no stress.

This wasn't the point -- it was that I wanted to do something that I know how to do in a fully featured programming language, and I can "just do it". Writing local files is a very simple example -- unless you're arguing that terraform's capabilities amount to an entire language's ecosystem, this is just the tip of the iceberg.

> So what's the plumbing that you would have to do? Under the hood, Pulumi is using the Terraform providers...

I think I didn't explain it well enough. Pulumi and Terraform are almost the same tool, but the difference is how the pieces are plumbed together. I prefer plumbing with a programming language more than shell scripts, utilities and/or some other programming languages.

Also, Pulumi's system of custom resources which are just pieces of code sitting in your codebase is fantastic and novel (Terraform has custom providers, but those feel significantly more heavyweight).

> In my pipelines I don't bother writing out to files things that are in terraform state. I just create an output for that state (potentially set to sensitive) and then use that output in my CI/CD. Remote state stays encrypted and without wide access and I don't have to worry about secrets being in files anywhere.

If you're really adjusted to Terraform, and it works great for you, then awesome -- I'm not out to change how you do things. It sounds like you've fully bought in to the terraform way of doing things, and it's working for you, and that's great.

> That's where the bash scripts do things with outputs. It could be Python or whatever, it doesn't matter really. But with bash I can easily just set variables to `terraform output -json | jq <select output &/|| do stuff>`.

Here it is again... My point is that the plumbing matters, and a fully baked programming language offers the possibility of better plumbing. I didn't touch on it much, but just having access to custom resources with Pulumi might be able to cut down the external plumbing to zero, and enable creating more reusable pieces.


> Also, Pulumi's system of custom resources which are just pieces of code sitting in your codebase is fantastic and novel (Terraform has custom providers, but those feel significantly more heavyweight).

Pulumi literally uses Terraform's custom providers as its dependencies under the hood to make this work.

Moreover, you're entrusting your entire production stack to an ultra-aggressive, hypergrowth, early stage startup...


Your statement about providers as dependencies is inaccurate in a number of respects:

- Pulumi has the ability to use the CRUD logic from Terraform providers to reify resources, but that is one of a number of different approaches it can use.

The Kubernetes and the new Azure Resource Manager providers are instead built from the API specs and additional annotations, and involve no aspects of Terraform.

- (Less importantly, but still:) Terraform does not have the notion of “custom” providers as a technical construct - first- and third-party providers implement the same protocol and have the same capabilities. There are just “providers”.

Disclaimer: I have contributed to Pulumi and now use it day to day at a large company. I also was a core maintainer of Terraform at HashiCorp for several years.


> Pulumi literally uses Terraform's custom providers as its dependencies under the hood to make this work.

Right, but you can see that the interface is easier as a custom resource, though, right? The Pulumi docs on making a custom resource amount to just extending a class. Making a good one I'm sure is fraught with peril, but it's so much easier than trying to extend Terraform as far as I can see. I'm glad that Pulumi has done this for me.

> Moreover, you're entrusting your entire production stack to an ultra-aggressive, hypergrowth, early stage startup...

Source code is Apache 2.0 and available[0]... I'm not against them trying to profit, but they've created a useful thing that is licensed very permissively (they could have gone with BSL or whatever else)...

[0]: https://github.com/pulumi/pulumi



