
Why not consider NVMe in this case then, as cheaper than RAM, slower than RAM, but faster than network storage? I don't know how you'd handle concurrency between VMs or virtualize that storage, but there must be some standard for that?



I think a lot of it depends on what the machines are used for. I'm not actually in the IT department, but I believe my org started out with a SAN-backed high-availability cluster, because the immediate need was getting basic infrastructure (wiki, source control, etc.) off of a dedicated machine that was a single point of failure.

But then down the road a different set of hosts were brought online that had fast local storage, and those were used for short term, throwaway environments like Jenkins builders, where performance was far more important than redundancy.


I’m laughing a little bit because an old place I used to work at had a similar setup. The SAN/NAS/whatever it was, was pretty slow, provisioning VMs was slow, and as much as we argued that we didn’t need redundancy for a lot of our VMs (they were semi-disposable), the IT department refused to give us a fast non-redundant machine.

And then one day the SAN blew up. Some kind of highly unlikely situation where more disks failed in a 24h period than it could handle, and we lost the entire array. Most of the stuff was available on tapes, but rebuilding the whole thing resulted in a significant period of downtime for everyone.

It ended up being a huge win for my team, since we had been in the process of setting up Ansible scripts to provision our whole system. We grabbed an old machine and had our stuff back up and running in about 20 minutes, while everyone else was manually reinstalling and reconfiguring their stuff for days.


Ha, that's awesome. Yeah, for the limited amount of stuff I maintain, I really like the simple, single-file Ansible script— install these handful of packages, insert this config file, set up this systemd service, and you're done. I know it's a lot harder for larger, more complicated systems that maintain a lot of internal configuration state and expect you to set it up in-band through a web GUI.
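For anyone who hasn't seen the pattern: the single-file playbook described above (install a handful of packages, drop in a config file, enable a systemd service) might look roughly like this. All package, path, and service names here are placeholders, not from the actual setup:

```yaml
# Minimal single-file playbook sketch; hostnames and names are illustrative.
- hosts: builders
  become: true
  tasks:
    - name: Install the handful of packages
      ansible.builtin.apt:
        name: [git, nginx]
        state: present

    - name: Insert the config file
      ansible.builtin.copy:
        src: files/app.conf
        dest: /etc/app/app.conf
        mode: "0644"

    - name: Set up and start the systemd service
      ansible.builtin.systemd:
        name: app
        enabled: true
        state: started
```

Run with something like `ansible-playbook -i inventory.ini site.yml`; because every task is idempotent, re-running it against a fresh machine is what makes the "back up in 20 minutes" recovery story possible.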

I experienced this recently trying to get a Sentry 10 cluster going— it's now this giant docker-compose setup with like 20 containers, and so I'm like "perfect, I'll insert all the config in my container deployment setup and then this will be trivially reproducible." Nope, turns out the particular microservice that I was trying to configure only uses its dedicated config file for standalone/debugging purposes; when it's being launched as part of the master system, everything is passed in at runtime and can only be set up from within the main app. Le sigh.


Lol. Hope may not be a strategy, but you need one if you don't want to learn that lesson the hard way: if it hurts, do it more often.


It depends on what you want to spend your labor dollars on. I run a complex system on HPE storage, all SSD or NVMe, and it's super fast. You pay, but the vendor mostly takes care of it operationally; I pay for about 20% of a SAN SME, mostly to do maintenance on the storage fabric.



