CrowdStrike accepting the Pwnie Award for "most epic fail" at DEF CON (twitter.com/singe)
405 points by teddyh 31 days ago | 356 comments



I appreciate that we’re finding the humour in this catastrophe but what about the question of liability? I have seen a few stories on HN of the billions lost by this event but so far not much in the way of lawsuits.

What is the situation? Are the licenses so ironclad that customers have no recourse? I could understand this in the case of consumers who might suffer minor inconvenience as their home PC is out of service for a few hours/days but it seems totally unacceptable for industries to accept this level of risk exposure.

This is one of the big reasons civil engineering is considered such a serious discipline. If a bridge collapses, there’s not only financial liability but the potential for criminal liability as well. Civil engineering students have it drilled into their heads that if they behave unethically or otherwise take unacceptable risks as an engineer they face jail time for it. Is there any path for software engineers to reach this level of accountability and norms of good practice?


> Civil engineering students have it drilled into their heads that if they behave unethically or otherwise take unacceptable risks as an engineer they face jail time for it. Is there any path for software engineers to reach this level of accountability and norms of good practice?

The problem is that with civil engineering you're designing a physical product. Nothing is ever designed to its absolute limit, and everything is built with a healthy safety margin. You calculate a bridge to carry bumper-to-bumper freight traffic, during a hurricane, when an earthquake hits - and then add 20%. Not entirely sure about whether a beam can handle it? Just size it up! Suddenly it's a lot less critical for your calculations to be exactly accurate - if you're off by 0.5% it just doesn't matter. You made a typo on the design documents? The builder will ask for clarification if you're trying to fit a 150ft beam into a 15.0ft gap. This means a bridge collapse is pretty much guaranteed to be the result of gross negligence.

Contrast that to programming. A single "<" instead of "<=" could be the difference between totally fine and billions of dollars of damages. There isn't a single programmer on Earth who could write a 100% bug-free application of nontrivial complexity. Even the seL4 microkernel - whose whole unique selling point is the fact that it has a formal correctness proof - contains bugs! Compilers and proof checkers aren't going to complain if you ask them to do something which is obviously the wrong thing but technically possible. No sane person would accept essentially unlimited liability over even the smallest mistakes.
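
To make it concrete, here's a deliberately tiny, hypothetical sketch (the names mean nothing; it just shows how one character separates "fine" from "crash"):

    def checksum(data):
        total = 0
        i = 0
        while i <= len(data):   # should be "<"; "<=" reads one element past the end
            total += data[i]    # IndexError on the final pass
            i += 1
        return total

    checksum([1, 2, 3])         # blows up instead of returning 6

A compiler or interpreter happily accepts both versions; only one of them takes down a fleet.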

If we want software engineers to have accountability, we first have to find a way to separate innocent run-of-the-mill mistakes from gross negligence - and that's going to be extremely hard to formalize.


To add onto this, the Pwnie Awards also go to people who get attacked, which is something that e.g. civil engineers certainly don't get blamed for (i.e. if a terrorist blows up their bridge).

We would need a way to draw a liability line between an incident that involves a 3rd-party attack and one that doesn't, but things like SolarWinds blur even that line, where there was blame on both sides. When does something become negligence, versus just the normal patching backlog that absolutely exists in every company?

And why are people aiming the gun already at software engineers, rather than management or Product Architects? SE's are the construction workers at the bridge site. Architects and Management are responsible for making, reviewing, and approving design choices. If they're trying to shift that responsibility to SEs by not doing e.g. SCA or code reviews, that's them trying to avoid liability.

Honestly, this reaction by the CEO is great for taking responsibility. Even if there's not legal liability, a lot of companies are still going to ditch CrowdStrike.


> the Pwnie Awards also go to people who get attacked, which is something that e.g. civil engineers certainly don't get blamed for

To be clear, this incident was not due to an attack—CrowdStrike just shot themselves in the foot with a bad update.


True, but the reason CrowdStrike has code running in a manner that is capable of bringing down the system, and the reason they push out updates all the time, is because they are in general combating attackers.

If there were no attacks, you wouldn't need such defensive measures, meaning the likelihood of a mistake causing this kind of damage would be almost nothing.


The trade is already a constant struggle with management over cutting corners and short term thinking. I’m not about to be blamed for that situation.


Do you think the situation for real engineers is different?


Look, I wasn't expecting anyone to thank me for my service when I went back to school for COBOL and saved all of your paychecks circa '97 - '99, but I'm not going to sit here and be compared to those bucket-toting girder jockeys.


Yes. Because whilst the same pressures exist, there's a small number of engineers licensed to actually sign off on a project, and they're not going to jeopardise that license for you.


Sounds like a case of real consequences for engineers working out well.


Only if you ignore downsides like drastically increased costs for most civil engineering projects.


> people who get attacked, which is something that e.g. civil engineers certainly don't get blamed for (i.e. if a terrorist blows up their bridge).

There's a really big difference though. In the physical world, an "attack" is always possible with enough physical force -- no matter how good of a lock you design, someone can still kick down the door, or cut through it, or blow it up. But with computer systems, assuming you don't have physical access, an attack is only possible as a result of a mistake on the part of the programmers. Practically speaking, there's no difference between writing an out-of-bounds array access that BSoD's millions of computers, and writing an out-of-bounds array access that opens millions of computers to a zero-day RCE, and the company should not be shielded from blame for their mistake only in the latter case because there's an "attacker" to point fingers at.

Over the past few years of seeing constant security breaches, always as the result of gross negligence on the part of a company -- and seeing those companies get away scot free because they were just innocent "victims of a cyberattack", I've become convinced that the only way executives will care to invest in security is if vulnerabilities come with bankrupt-your-company levels of liability.

Right now, the costs of a catastrophic mistake are borne by the true victims -- the innocent customer who had their data leaked or their computer crashed. Those costs should be borne by the entity who made the mistake, and had the power to avoid it by investing in code quality, validating their inputs, using memory-safe languages, testing and reviewing their code, etc.

Yes, we can't just all write bug-free code, and holding companies accountable won't just stop security vulnerabilities overnight. But there's a ton of room for improvement, and with how much we rely on computers for our daily lives now, I'd rather live in a world where corporate executives tell their teams "you need to write this software in Rust because we'll get a huge discount on our liability insurance." It won't be a perfect world, but it'd be a huge improvement over this insane wild west status quo we have right now.


> In the physical world, an "attack" is always possible with enough physical force -- no matter how good of a lock you design, someone can still kick down the door, or cut through it, or blow it up. But with computer systems, assuming you don't have physical access, an attack is only possible as a result of a mistake on the part of the programmers.

It's exactly the opposite.

In the physical world, you mostly only have to defend against small-time attackers. No bank in the world is safe from, say, an enemy army invading. The way that kind of safety gets handled is by the state itself - that's what the army is for.

In the digital world, you are constantly being attacked by the equivalent of a hundred armies, all the time. Hackers around the world, whether criminals or actual state-actors, are constantly trying to break into any system they can.

So yes, many breaches involve some kind of software issue, but it is impossible to never make any mistake. Just like no physical bank in the world would survive 1000s of teams trying to break in every single day.


> state-actors, are constantly trying to break into any system they can.

I thought state actors prefer to buy over build. Do they really need to build a botnet out of your personal computer rather than just expanding their own datacenters?


State actors breaking into systems aren't doing it to use them in a botnet...


Agreed on all counts.

> In the digital world, you are constantly being attacked by the equivalent of a hundred armies, all the time. Hackers around the world, whether criminals or actual state-actors, are constantly trying to break into any system they can.

This is why I think cyberattacks should be seen from the "victim"'s perspective as something more like a force of nature rather than a crime -- they're ubiquitous and constant, they come from all over the world, and no amount of law enforcement will completely prevent them. If you build a building that can't stand up to the rain or the wind, you're not an innocent victim of the weather, you failed to design a building for the conditions you knew would be there.

(I'm not saying that we shouldn't prosecute cyber crime, but that companies shouldn't be able to get out of liability by saying "it's the criminals' fault").

> So yes, many breaches involve some kind of software issue, but it is impossible to never make any mistake.

It's not possible to never make a mistake, no. But there's a huge spectrum between writing a SQL injection vulnerability and a complicated kernel use-after-free that becomes a zero-click RCE with an NSO-style exploit chain, and I'm much more sympathetic to the latter kind of mistake than the former.

The fact is that most exploits aren't very sophisticated -- someone used string interpolation to build an SQL query, or didn't do any bounds checking at all in their C program, or didn't update 3rd-party software on an internal server for 5 years. And for as long as these kinds of mistakes don't have consequences, there's no incentive for a company to adopt the kind of structural and procedural changes that minimize these risks.

In my ideal world, companies that follow good engineering practices, build systems that are secure by design, and get breached by a nation state actor in a "this could have happened to anyone" attack should be fine, whether through legislation or insurance. But when a company cheaps out on software and develops code in a rush, without attention to security, then they shouldn't get to socialize the costs of the inevitable breach.


> If you build a building that can't stand up to the rain or the wind, you're not an innocent victim of the weather, you failed to design a building for the conditions you knew would be there.

I genuinely have no idea how liability for civil engineering works, but the evidence of my eyes is that entire Oklahoma towns built by civil engineers get wiped off the map by tornadoes all the time. Therefore I assume either we can't design a tornado-proof building, or civil engineering gets the same cost-benefit analysis as security engineering. The acceptable cost-benefit balance is just different. But we can't be selling $10 million tornado-proof shacks, and we can't be selling $10 million bug-proof small business applications, if either is even possible.


> If you build a building that can't stand up to the rain or the wind, you're not an innocent victim of the weather, you failed to design a building for the conditions you knew would be there.

This is why I liken it to protecting from an army. Wanting to protect a building from rain is fine - rain is a constant that isn't adapting and "fighting back".

Find me a building that is able to keep its occupants safe from an invading army, and then we'll talk. It's impossible. That's what we built armies for.

> But there's a huge spectrum between writing a SQL injection vulnerability and a complicated kernel use-after-free that becomes a zero-click RCE with an NSO-style exploit chain, and I'm much more sympathetic to the latter kind of mistake than the former.

To be clear, I agree that there's a spectrum, and I wouldn't want to make it so that companies can get away with everything. But I'm not sure we have a good solution for "my company has 10k engineers, one of them five years ago set up a server and everyone forgot it exists, now it's exploitable". Not in the general case of having so many employees.

> The fact is that most exploits aren't very sophisticated -- someone used string interpolation to build an SQL query, or didn't do any bounds checking at all in their C program, or didn't update 3rd-party software on an internal server for 5 years. And for as long as these kinds of mistakes don't have consequences, there's no incentive for a company to adopt the kind of structural and procedural changes that minimize these risks.

I'm not a security researcher, but I'd guess that most exploits are even simpler - they don't even necessarily rely on software exploits, they rely on phishing, on social engineering, etc.

I've seen plenty of demos of people being able to "hack" many companies by just knowing the lingo and calling a few employees while pretending to be from IT.

This doesn't even include "exploits" like getting spies into a company, or just flat-out blackmailing employees. Do you think the systems you've worked on are secure from a criminal organization applying physical intimidation on IT personnel? (I won't go into details but I'm sure you can imagine worst-case scenarios here yourself.)

> But when a company cheaps out on software and develops code in a rush, without attention to security, then they shouldn't get to socialize the costs of the inevitable breach.

I agree, but there's a huge range between "builds software cheaply" and "builds software which is secure by default" (the second being basically impossible - find me a company that has never been breached if you think it's doable).

We want to make companies pay the cost when it incentivizes good behavior. That's sometimes the case, hence my agreeing with you for many cases.

But security is a game of weakest links, and given thousands of adversaries of various levels of strength, from script-kiddies to state actors, every company is vulnerable on some level. Which is why, in addition to making companies liable for real negligence, we have to recognize that no company is safe, even given enormous levels of effort, and the only way to truly protect them is via some state action.

The reason your bank isn't broken into isn't just that they are amazing at security - it's that if someone breaks into your bank, the state will investigate, hunt them down, arrest them and imprison them.


Show me a company that claims it's never been breached in some way, and I'll show you a company that has no clue about security, including their prior breaches.


Having such consequences would completely stop any innovation and put us into total technological stagnation.

Which would of course result in many other and arguably much worse consequences for society.


Oh, it would do worse than that.

Every country in the world would see this as their big chance to overtake the US. Russia, China, you name it.

You would have to be an idiot to start a software company in the US. High regulation, high cost of living, high taxes, high salaries, personal liability, and a market controlled by monopolies who have the resources to comply.

They’ll leave. The entire world will be offering every incentive to leave. China would offer $50K bonuses to every engineer that emigrated the next day.


I'm confused. Why would they emigrate? You just said "high salaries"?

Moreover, China is hardly low regulation. You would get there and then not be able to check your email.


This is less complicated than you think.

Civil engineering rules, safety margins and procedures have been established through the years as people died from their absence. The practice of civil engineering is arguably millennia old.

Software is too new to have the same lessons learned and enacted into law.

The problem isn't that software lacks the kind of practices and procedures that would prevent these kinds of errors (see the space shuttle code, for example); it's that we haven't formalized their application into law, and the "terms of service" that protect software makers have so far prevented case law from establishing liability when you don't use them.

Software engineering, compared to other engineering disciplines, has had a massive effect on the world in an incredibly short amount of time.


The other side of it is this. By law, a licensed civil engineer must sign off on a civil engineering project. When doing so, the engineer takes personal legal liability. But the fact that the company needs an engineer to take responsibility means that if management tries to cut too many corners, the engineer can tell them to take a hike until they are willing to do it properly.

Both sides have to go together. You have to put authority and responsibility together. In the end, we won't get better software unless programmers are given both authority AND responsibility. Right now programmers are given neither. If one programmer says no, they are just fired for another one who will say yes. Management finds one-sided disclaimers of liability to be cheaper than security. And this is not likely to change any time soon.

Unfortunately the way that these things get changed is that politicians get involved. And let me tell you, whatever solution they come up with is going to be worse for everyone than what we have now. It won't be until several rounds of disaster that there's a chance of getting an actually workable solution.


Engineering uses repeatable processes that will ensure the final product works with a safety margin. There is no way to add a safety margin to code. Engineered solutions tend to have limited complexity or parts with limited complexity that can be evaluated on their own. No one can certify that a 1M+ line codebase is free from fatal flaws no matter what the test suite says.


> There is no way to add a safety margin to code.

This is, in my opinion, an incredibly naive take.

There are currently decades of safety margin in basically all running code on every major OS and device, at every level of execution and operation. Sandboxing, user separation, kernel/userland separation, code signing (of kernels, kernel extensions/modules/drivers, regular applications), MMUs, CPU runlevels, firewalls/NAT, passwords, cryptography, stack/etc protections built into compilers, memory-safe languages, hardware-backed trusted execution, virtualization/containerization, hell even things like code review, version control, static analysis fall under this. And countless more, and more being developed and designed constantly.

The “safety margin” is simply more complex from a classic engineering perspective and still being figured out, and it will never be as simple as “just make the code 5% more safe.” It will take decades, if not longer, to reach a point where any given piece of software could be considered “very safe” like you would any given bridge. But to say that “there is no way to add a safety margin to code” is oversimplifying the issue and akin to throwing your hands up in the air in defeat. That’s not a productive attitude to improve the overall safety of this profession (although it is unfortunately very common, and its commonality is part of the reason we’re in the mess we’re in right now). As the sibling comment says, no one (reasonable) is asking for perfection here, yet. “Good enough” right now generally means not making the same mistakes that have already been made hundreds/thousands/millions of times in the last 6 decades, and working to improve the state of the art gradually over time.


Exactly.

Part of the evaluation has to be whether the disaster was due to what should have been preventable. If you're compromised by an APT, no liability. Much like a building is not supposed to stand up to dynamite. But if someone fat-fingered a configuration, you had no proper test environment as part of deployment, and hospitals and 911 systems went down because of it?

There is a legal term that should apply. That term is "criminal negligence". But that term can't apply for the simple reason that there is no generally accepted standard by which you could be considered negligent.


Except nobody is asking for perfection here. Every time these disasters happen, people reflexively respond to any hint of oversight with stuff like this. And yet, the cockups are always hilariously bad. It's not "oh, we found a 34-step buffer overflow that happens once every century", it's "we pushed an untested update to eight million computers lol oops". If folks are afraid that we can't prevent THAT, then please tell me what software they've worked on so I can never use it ever.


An Airbus A380 comprises about 4 million parts yet can be certified and operated within a safety margin.

Not that I think lines of code are equivalent to airplane parts, but we have to quantify complexity some way and you decided to use lines of code in your comment so I’m just continuing with that.

The reality is that we’re still just super early in the engineering discipline of software development. That shows up in poor abstractions (e.g. what is the correct way to measure software complexity), and it shows up in unwillingness of developers to submit themselves to standard abstractions and repeatable processes.

Everyone wants to write their own custom code at whatever level in the stack they think appropriate. This is equivalent to the days when every bridge or machine was hand-made with custom fasteners and locally sourced variable materials. Bridges and machines were less reliable back then too.

Every reliably engineered thing we can think of—bridges, airplanes, buildings, etc.—went through long periods of time when anyone could and would just slap one together in whatever innovative, fast, cheap way they wanted to try. Reliability was low, but so was accountability, and it was fast and fun. Software is largely still in that stage globally. I bet it won’t be like that forever though.


It seems to me if something is not safe and we can't make it reasonably safe, we shouldn't use it.


This is all true. But we _do_ have known best practices that reduce the impact of bugs.

Even the most trivial staged rollout would have caught this issue. And we're not talking about multi-week testing; even a few hours of testing would have been fine. Failure to do that rises to the level of gross negligence.


True but they are under time pressure to add definitions for emerging vulnerabilities.


Doctors, engineers, and lawyers aren't infinitely accountable to their equivalent of bugs. Structures still fail, patients die, and lawyers lose cases despite the reality of the crime.

But they're liable when they fuck up beyond what their industry decides is acceptable. If Crowdstrike really wasn't testing the final build of their configuration files at all, then yeah -- that's obviously negligent given the potential impact and lack of customer ability to do staged rollouts. But if a software company has a bug that wasn't caught because they can't solve the halting problem, then no professional review board should fault the license holder.

> we first have to find a way to separate innocent run-of-the-mill mistakes from gross negligence - and that's going to be extremely hard to formalize.

I think we just (oh god -- no sentence with a just is actually that easy) need to actually look at other professional licenses to learn how their processes work. Because they've managed to incorporate humans analyzing situations where you can't have perfect information into a real process.

But I don't think any of this will happen while software is still making absolute shit loads of money.


This entire comment boils down to "we can't be held accountable because it's soooo hard you guys", which isn't even convincing to me as someone in the industry and certainly won't be to someone outside it.


What a shallow dismissal of a comment that doesn’t even claim that there shouldn’t be accountability.


His dismissal is absolutely right though. Programmers have gotten way too used to waving their hands at the public and saying "gosh I know it's hard to understand but this stuff is so hard". Well no, sorry, there's not a single <= in place of a < that couldn't have been caught in a unit test.
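
For example, a minimal, hypothetical boundary test (not anyone's real code) is all it takes to catch that class of bug:

    import unittest

    def last_index(items):
        # hypothetical helper with the classic mistake: "<=" where "<" belongs
        i = 0
        while i <= len(items):
            i += 1
        return i - 1

    class BoundaryTest(unittest.TestCase):
        def test_exact_boundary(self):
            # exercising the exact length is what exposes the off-by-one
            self.assertEqual(last_index([1, 2, 3]), 2)

    if __name__ == "__main__":
        unittest.main()

The test fails today, someone fixes the comparison, and it never ships.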


You're right, in the case that it was known to be a problem. There are lots of places where the "<= or <" decision can be made, some long before some guy opens a text editor; in those cases, the unit test might not catch anything because the spec is wrong!

A major difference between software development and engineering is that the requirements must be validated and accepted by the PE as part of the engineering process, and there are legal and cultural rails that exist to make that evaluation protected, and as part of that protection more independent--which I think everyone acknowledges is an imperfect independence, but it's a lot further along than software.

To fairly impute liability to a software professional, that software professional needs to be protected from safety-conscious but profit-harmful decisions. This points to some mixture of legislation (and international legislation at that), along with collective bargaining and unionization. Which are both fine approaches by me, but they also seem to cause a lot of agita from a lot of the same folks who want more software liability.


> in those cases, the unit test might not catch anything because the spec is wrong!

That's why you have three different, independent parties design everything important thrice, and compare the results. I'm serious. If you're not convinced this is necessary, just take a look at https://ghostwriteattack.com/riscvuzz.pdf.

(Your other suggestions are also necessary, and I don't think that would be sufficient.)
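
A toy sketch of what the "thrice" idea looks like in code, N-version style, with a majority vote deciding whether to trust the answer (the three implementations here are invented and trivially small):

    from collections import Counter

    def impl_a(x):
        return x * x

    def impl_b(x):
        return x ** 2

    def impl_c(x):
        # independently written, and buggy for negative inputs
        return sum(x for _ in range(x))

    def voted_square(x):
        results = [impl_a(x), impl_b(x), impl_c(x)]
        value, count = Counter(results).most_common(1)[0]
        if count < 2:
            raise RuntimeError("implementations disagree, refusing to answer")
        return value

    print(voted_square(-3))   # 9: the two correct versions outvote the buggy one

Expensive, obviously, which is why it's reserved for the genuinely important stuff.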


I think that's a great idea, and when I've been in a leadership role I've at least tried to have important things done at least twice. ;)

And you're right, I was pretty much just outlining what might be called "a good start".


> This entire comment boils down to "we can't be held accountable because it's soooo hard you guys", which isn't even convincing to me as someone in the industry and certainly won't be to someone outside it.

When that cargo ship hit the bridge in Baltimore and people were calling for bridges to be designed to take that kind of hit, I heard a lot of "that's sooo impossible you guys" from 'real' engineers. Because it apparently is.

We can do (almost) anything, but we can't always do it for amounts people are willing to pay, where 'we' is everybody and 'willing to pay' means if you charge me what it would take to make it safe or secure, I'll redneck engineer it with none of that built in at all. People are not going to stop finding affordable ways to cross rivers or use web servers just because hard stuff is expensive.


If it's too hard for everyone to do, then yeah, it's too hard.

At the end of the day, what matters is if you can, y'know, do the thing. And people just can't.

> which isn't even convincing to me as someone in the industry

Then you're confident that you can write bulletproof software? Prove it. Thankfully, as an industry we're pretty good at compromising software even if we can't write uncompromisable software.

Since we're talking about serious liability, how about put up a multi million dollar bounty for any single bug found in a non-trivial program that you write?


> Contrast that to programming. A single "<" instead of "<=" could be the difference between totally fine and billions of dollars of damages.

Disagree. This is true purely at the coding level, yes. Anyone could make a typo.

If you're running a company that releases software with the risk exposure of crowdstrike, you better not have a release model where that typo goes straight to production. There need to be many layers of different kinds of testing. If carefully built, now there are many layers all of which have to fail for the bug to go live. You can bring the failure probability down to negligible levels with enough layers of validation.
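
To sketch just one of those layers, here's roughly what a ring-based staged rollout gate might look like; every name, fraction, and threshold here is made up for illustration, not anyone's actual pipeline:

    import time

    # hypothetical deployment rings: each ring gets the update only after the
    # previous ring has soaked and stayed healthy; fractions/thresholds invented
    RINGS = [("internal", 0.001), ("canary", 0.01), ("early", 0.10), ("broad", 1.0)]
    SOAK_SECONDS = 3600       # let crash telemetry accumulate before judging
    MAX_CRASH_RATE = 0.001    # crashes per deployed host we tolerate

    def deploy(update, fraction):
        print(f"deploying {update} to {fraction:.1%} of the fleet")

    def rollback(update):
        print(f"rolling back {update}")

    def crash_rate(ring):
        return 0.0            # placeholder for real fleet telemetry

    def staged_rollout(update):
        for ring, fraction in RINGS:
            deploy(update, fraction)
            time.sleep(SOAK_SECONDS)
            if crash_rate(ring) > MAX_CRASH_RATE:
                rollback(update)
                raise RuntimeError(f"rollout halted in ring '{ring}'")

    staged_rollout("new-content-update")

Each additional ring is one more chance for the bug to be caught before it reaches 100% of the fleet.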

> find a way to separate innocent run-of-the-mill mistakes from gross negligence - and that's going to be extremely hard to formalize.

I don't think it's that hard. Not saying it is trivial, but it is well within the capability of the industry if we just focused a little bit on quality instead of 100% on profit.

Standardize models and layers of testing coverage. If you implement them all then you're not being negligent and thus should not be liable. If you decide to skip them, liable.


> Nothing is ever designed to its absolute limit, and everything is built with a healthy safety margin. You calculate a bridge to carry bumper-to-bumper freight traffic, during a hurricane, when an earthquake hits - and then add 20%. Not entirely sure about whether a beam can handle it? Just size it up! Suddenly it's a lot less critical for your calculations to be exactly accurate

That may have been true a couple hundred years ago. It's not been true for a couple decades now, because budget became a constraint even more important than physics, and believe it or not, you will have to justify every dollar that goes into your safety margin. That's where the accuracy of modern techniques matter: the more accurate your calculations (and the more consistent inputs and processes builders employ), the less material you can use to get even closer to the designed safety margin. Accidentally making a bridge too safe means setting money on fire, and we can't have that.

That's the curse of progress. Better tools and techniques should allow us to get more value - efficiency, safety, utility - for the same effort. Unfortunately, economic pressure makes companies opt for getting the same or less[0] value for less effort. Civil engineering suffers from this just as much as software engineering does.

--

[0] - Eventually asymptotically approaching the minimum legal quality standard.


> Accidentally making a bridge too safe means setting money on fire, and we can't have that.

There's a quote I've seen various versions of: anyone can build a bridge that is safe. It takes an engineer to build a bridge that is just barely safe.


> Contrast that to programming. A single "<" instead of "<=" could be the difference between totally fine and billions of dollars of damages.

I fail to see the difference between a misplaced operator and a misplaced bolt (think Hyatt walkway collapse), both of which could have catastrophic consequences. Do you think the CAD software they use to perform the calculations is allowed to have bugs simply because it's software?

Maybe go back to entering code on punch cards if you're so fixated on the physical domain being the problem.


There's a reason we talk about the Hyatt walkway collapse but not the misplaced operator.


It could happen. People have been predicting it for years, and many think that it is only a matter of time. For a vision from 1982 of how it could happen, see: <https://books.google.com/books?id=6f8VqnZaPQwC&pg=PA167>

Consider the following scenario. We are living in 1997, and the world of office automation has finally arrived. Powerful computers that would have filled a room in 1980 now fit neatly in the bottom of drawer of every executive’s desk, which is nothing more than heavy glass plate covering an array of keyboards, screens, and color displays.

— The Network Revolution: Confessions of a Computer Scientist; Jacques Vallee, 1982


I like the analogy. What would be the equivalent of "adding safety margins" for a piece of critical code? Building three of them with different technologies and making sure they all return the same results?


did they take basic precautions like staged releases, code reviews, integration tests?

if not, then it's literally the engineer equivalent of gross negligence and they do deserve to be sued to oblivion.


Do people actually believe it when a company says something caused billions of dollars of damage? Unless you can quantify it, much like law enforcement and articulable suspicion, it's pretty useless as a metric. If you can pull something out of your ass, what does it matter?


Delta threatened to sue them for their $500M loss. Crowdstrike replied (publicly) pointing out that their contract limits Crowdstrike's liability to single digit millions.

They then gave them a list of things they would seek in discovery, such as their backup plans, failover plans, testing schedules and results, when their last backup recovery exercise was, etc.

Basically, they said, "if you sue us, we will dig so deep into your IT practices that it will be more embarrassing for you than us and show that you were at fault".


It really seems funny that Crowdstrike’s defense is basically “you should have been better prepared for us to knock all of your systems offline.”

It’s probably true, but seems like an odd stance to take from a PR perspective or a “selling other clients in the future” perspective.


In the case of Delta, their outage was much longer than everyone else because they refused help from both Crowdstrike and Microsoft. So their defense is basically "the damages could have been mitigated if you'd listened to us".


> they refused help from both Crowdstrike and Microsoft

Link?

Anyway I find it highly amusing that Delta is seeking damages from Microsoft even though Microsoft had nothing to do with it.


There are many articles about them refusing help, but here is one:

https://www.theverge.com/2024/8/6/24214371/microsoft-delta-l...


Delta's position is that Microsoft actively recommended and coordinated with CrowdStrike to the extent that they are co-responsible for outcomes. In a large enterprise like Delta, the vendors do work together in deployment and support. Yes, there's often a great deal of finger-pointing between vendors when something like this happens, but in general vendors so intimately linked have each other on speed-dial. It would not shock me to learn that Delta has email or chat threads involving CrowdStrike, Microsoft, and Delta employees working together during rollouts and upgrades, prior to this event.

As far as refusing help, why is that funny? If someone does something stupid and knocks you down, it's perfectly reasonable to distrust the help they offer, especially if that help requires giving them even more trust than what they've already burned.


Changing vendors and choosing one that's more reliable is a perfectly sensible outcome of this situation once your systems are back up and you're no longer hemorrhaging money.

During an ongoing incident, when all of your operations are down, is not the time for it though. If you think there's even a 1% chance that the help can help, you should probably take it and fix your immediate problem. You can re-evaluate your decisions and vendor choices after that.


> If someone does something stupid and knocks you down, it's perfectly reasonable to distrust the help they offer, especially if that help requires giving them even more trust than what they've already burned.

Yeah it smacks of Experian offering you a year of "free identity theft protection" after having lost your personal data in a breach.


That's kind of typical of how much companies have been allowed to externalize costs. It's never about how the company at fault should have done better, rather it typically boils down to some variant of "the free markets provided you with a choice about who you trust and it was up to you to collect and evaluate all the information available to make your choices".


That’s kinda what aws tells people when its services go down. If your backend can’t take a short outage without weeks of recovery then it’s just a matter of time.


> Delta threatened to sue them for their $500M loss. Crowdstrike replied (publicly) pointing out that their contract limits Crowdstrike's liability to single digit millions.

Delta's move seems like an attempt to assuage shareholders and help the CEO save face.

Crowdstrike shouldn't be afraid of Delta. Crowdstrike should be afraid of the insurance companies that have to pay out to those businesses that have coverage that includes events like this.

Even if the payout to a company is $10,000, a big insurance company may have hundreds or thousands of similar payouts to make. The insurance companies won't just let that go; and they know exactly what to look for, how to find it, and have the people, lawyers, and time to make it happen.

Crowdstrike will get its day of reckoning. It won't be today. And it probably won't be public. But the insurance companies will make sure it comes, and it's going to hurt.


> the insurance companies will make sure it comes, and it's going to hurt

It could be as simple as a reinsurer refusing to renew coverage if a company uses CrowdStrike.


Which would be funny, since many companies are putting up with Crowdstrike to make insurers happy.


Availability (or not) of insurance coverage is surprisingly effective in enabling or disabling various commercial ventures.

The penny dropped for me whilst reading James Burke's Connections on the exceedingly-delayed introduction of the lateen-rigged sail to Europe, largely on the basis that the syndicates which underwrote (and insured) shipping voyages wouldn't provide financing and coverage to ships so rigged.

Far more recently we have notions of redlining for both mortgage lending and insurance coverage (title, mortgage, property, casualty) in inner-city housing and retail markets. The co-inventor of packet-based switching writes of his parents' experience with this in Philadelphia:

"On the Future Computer Era: Modification of the American Character and the Role of the Engineer, or, A Little Caution in the Haste to Number" (1968)

<https://www.rand.org/pubs/papers/P3780.html> (footnote, p. 6).

Similarly, government insurance or guarantees (Medicare, SSI, flood insurance, nuclear power plants) have made high-risk prospects possible, or enabled effective services and markets, where laissez-faire approaches would break down.

I propose that similar approaches to issues such as privacy violation might be worth investigating. E.g., voiding any insurance policy over damages caused through the harmful use or unintended disclosure of private information. Much of the current surveillance-capitalism sector would instantly become toxic. The principal current barriers to this are that states themselves benefit through such surveillance, and of course the current industry is highly effective at lobbying for its continuance.


That's interesting, because the TV episode states that insurers wanted the risk of piracy spread out over many smaller ships that would be lateen-rigged. I have one of the Connections books, so I'll check to see if this is covered in it https://youtu.be/1NqRbBvujHY?si=WfysDHPLhSJkGhzd


Interesting discrepancy, yes. I'm pretty sure of my recollection of the book.

It may be that the opportunity to diversify risk (over more smaller ships) overcame the reluctance to adopt new, untested and/or foreign technology.


It doesn’t explicitly say insurers but it’s a pretty small logical leap from the wording (the timeframe is also c. 11th-12th century so could be before formal insurers)


Right.

The books and video scripts also differ amongst Burke's various series. I'll see if I can find a copy of the text to compare.


> they said, "if you sue us, we will dig so deep into your IT practices that it will be more embarrassing for you than us and show that you were at fault"

But CrowdStrike said this publicly. If they’d privately relayed it to Delta, it would have been genuine. By performatively relaying it, however, it seems they’re pre-managing optics around the expected suit.


It's an argument that hits home at any bigcorp where the execs are entertaining the thought of suing CrowdStrike. Making it public once is a lot more effective than relaying it privately a hundred times. I expect most liability to come from abroad, where parts of the contract might be annulled for not being in line with local law. But still, I don't expect it. CrowdStrike delivered the service they promised. The rest is on the customer's IT. Hand over the keys and your car may be driven.


> It’s an argument that hits home at any bigcorp where the execs are entertaining the thought of suing CrowdStrike

Maybe? Discovery is a core element of any lawsuit. It’s also a protected process: you can’t troll through confidential stuff with an intent to make it public to damage the litigant.

If anything, I could see Delta pointing to this statement to restrict what CrowdStrike accesses and how [1]. (As well as with the judge when debating what gets redacted or sealed.)

[1] https://www.fjc.gov/sites/default/files/2012/ConfidentialDis...


Thank you. Nice read. Even given a protective order to keep discovery confidential, the ensuing discussion about the clients lacking IT-policies that exacerbated this crisis is public.

Most entertaining would be the discussion where CrowdStrike would argue that based on common IT-risk criteria, you should never hand over the keys to an unaudited party not practicing common IT-risk best practices and (thus) the liability is on the organization. Talk about CrowdStrike managing risks worldwide. They are doing it right now!


Or attempting to discourage it from becoming a pile-on.


It doesn't matter, it was 100% Crowdstrike's fault. Surprised it's still worth 60 billion dollars.


Part of the problem is assuming you can pay a contract to shift your liability completely away.


Right, the risk structure presumably protects the vendor if just one customer sues, even if the amount of damages claimed is astronomical. Because vendors try to disclaim bet-the-company liability on a single contract.[1] The vendor's game is to make sure the rest of the customer base does not follow this example, because as noted in the linked article while vendors don't accept bet-the-company liability on each contract (or try not to), they do normally have some significant exposure measured in multiples of annual spend.

[1] https://www.gs2law.com/blog/current-trends-in-liability-limi...


The assumption is not only perfectly valid, it's the very reason such contracts are signed in the first place! It's what companies want to buy, and it's what IT security companies exist to sell.


Yes, I know that's what everyone wants/thinks, but you actually can't do it. Because at the end of the day, you chose the vendor. So you are still liable for all of it.


Well if MSFT knew how to write MSAs Crowdstrike would have become property of Microsoft.


Yes and no.

Crowdstrike was the executioner of this epic fail for sure, but Delta's archaic infra practices made it even worse. Both the Crowdstrike and Microsoft CEOs reached out, only to be rebuffed by Delta's own. If I were the CEO, I'd accept any help I could get while I still had the benefit of public opinion.

/tin-foil-hat-on Flat out refusal for help makes me think there are other skeletons in the closet that makes Delta look even worse /tin-foil-hat-off


> If I were the CEO, I'd accept any help I could get while I still had the benefit of public opinion

I’d reserve judgement. Delta may have been cautious about giving the arsonists a wider remit.


In this case, the fire was an accident, and the arsonists happen to be the expert firefighters, and they're very motivated to fix their mistake. They're still the experts in all stuff fire, whereas Delta is not.


Using your analogy - if MS/CS are the arsonists, then Delta are the landlords unsafely storing ammonium nitrate in their own warehouse.

Their lack of response to MS/CS isn't coming from a place of reducing potential additional problems, but from trying to shield their own inadequacies while a potential lawsuit is brewing in the background.

https://www.reuters.com/technology/microsoft-blames-delta-it...


It doesn't seem like arsonist is the right word. It implies it was intentional, which as far as I can tell there is no proof of.

I think the more accurate description would be that some firefighters were doing a controlled burn. The burn got out of control, and then you say you don't want the firefighters' help putting out the fire.


If you held the view that CrowdStrike and Microsoft were inherently to blame for the problem why would you trust them to meaningfully help? At best they're only capable of getting you right back into the same position that left you vulnerable to begin with.


Same reason why an aircraft manufacturing company would get involved in a NTSB investigation when there is an airplane crash. Just because they messed up one or more things (i.e. MCAS on MAX) doesn't mean they can't provide expertise or additional resources to at least help with the problem.

Your take also casually disregards the fact that Delta took an extraordinarily long time to recover from the problem when the other companies recovered (albeit slowly). This is the point that I'm getting at. It isn't that CS and MS aren't culpable for the outage; it's that DAL also contributed to the problem by not adequately investing in its infra.


> Same reason why an aircraft manufacturing company would get involved in a NTSB investigation when there is an airplane crash

Key difference here is that the NTSB is third party with force of law behind it. The victims in the crash – airlines and passengers – aren't rushing to the aircraft manufacturer to come fix things. Quite the opposite: the NTSB and FAA have the authority to quarantine a crash site and ensure nobody tampers with the evidence. Possible tampering with black boxes was an issue in the investigation of Air France Flight 296Q.


Being to blame is different than being actively trying to sabotage you. Many companies will be re-evaluating their relationship after this problem happened, but doing that while your systems aren't functional seems counter-productive.


Seems fair. Delta didn't privately relay their intentions.


Weirdly, we live in a society


That’s not the way legal process works. CrowdStrike might be permitted to conduct discovery, but that won’t entitle them to share what they might find with the public, embarrassing or otherwise. Business records and other sensitive information relating to parties in civil matters are frequently sealed.


I'm not sure anything else was material given that the machines were bricked and client roll-out approaches were bypassed by Crowdstrike. What client actions would have helped?

Surely someone is looking at a class action? People died. The contract can’t make that everyone else’s problem, can it?


Sure it can. If every rock climbing company in the country decides that climbing ropes are too expensive and instead decides to buy rope from the local hardware store, and that rope has a warning reading "not for use when life or valuable property is at risk", then it is 100% on those climbing companies when people die, because they were using a product in a situation that it was simply not suitable for.

The details, of course, depend on the contract and claims that Crowdstrike made. But, in the abstract, you are not responsible for making your product suitable for any use that anyone decides to use it for.

If a hospital wants to install software on their life-critical infrastructure, they are supposed to buy software that is suitable for life-critical infrastructure.


If someone's life depends on a networked Windows (or any similar OS) machine you chose to run for that purpose, you are the criminal.


Indeed. But this is how hospitals run.


I'd LOVE to see Crowdstrike do this. The last time I dealt with the specifics of this sort of validation testing for security software was a decade ago, and from what I saw in the RCA, Delta can just keep pointing out that whatever they had worked until Crowdstrike failed to understand that the number 20 and the number 21 are not the same:

The new IPC Template Type defined 21 input parameter fields, but the integration code that invoked the Content Interpreter with Channel File 291’s Template Instances supplied only 20 input values to match against. This parameter count mismatch evaded multiple layers of build validation and testing, as it was not discovered during the sensor release testing process, the Template Type (using a test Template Instance) stress testing or the first several successful deployments of IPC Template Instances in the field.

This, combined with the lack of partitioning of updates, makes me conclude they're missing table stakes WRT validation.
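
For reference, the check that would have refused to ship that mismatch is on the order of ten lines; a hypothetical sketch with invented field names, not their actual schema:

    def validate_template(template_type, template_instance):
        declared = len(template_type["parameters"])   # e.g. 21
        supplied = len(template_instance["inputs"])   # e.g. 20
        if declared != supplied:
            raise ValueError(
                f"{template_type['name']}: declares {declared} parameters "
                f"but the instance supplies only {supplied} inputs"
            )

    # the kind of mismatch described in the RCA, caught at build time:
    validate_template(
        {"name": "IPC", "parameters": list(range(21))},
        {"inputs": list(range(20))},
    )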


Wtf how do you not check for ‘quantity of arguments’ in QA testing?


They should be providing all that information regularly to auditors anyway. If they don’t have it handy, then their IT leadership should be replaced.


That's odd. One is an internal process with no obligation to an external party, and the other is the party specifically liable for any repercussions of deviating from their own SDLC process[1], a process they totally skipped themselves?

If I were Delta, I’d get other affected parties and together sue CrowdStrike and get all their dirty laundry out in the open.

[1] I haven’t checked but they used to list all their ISO certs, etc. Wonder if those get revoked for such glaring violations…


Civil suits focus in a large way on determining how much damage is each party’s fault. So Crowdstrike would be saying “Of this $500M in damages, x% was from your own shitty practices not from our mistake”. Thats why it’s all pertinent.


Correct. The legal term is “contributory negligence.”


> One is an internal process which has no obligation to an external party

Delta has obligations to their passengers and sidesteps its own screw-ups with similar contractual provisions. How much would Delta owe for not following similar IT practices? Do they now owe customers for their IT failings? Should customers now get to sue Delta for damages related to their poor IT recovery compared to other airlines?


Sure but that’d be something passengers could bring up in a suit against Delta, not someone like CS, who themselves obviously skipped their own internal SDLC and whatever other ISO certs they prominently advertised on their website.


Crowdstrike's discovery process would greatly aid in passenger or general-public suits against Delta.


I assume the argument is that if they can show negligence in their IT practices, then the $500 million in damages can't be all attributed to CrowdStrike's failure.


They might find out delta does embarrassing things like not testing out of bounds array access or does global deployments without canarying.


There is recourse, just not for normal people, as you eluded to. Companies are suing crowdstrike and will continue to, and based on the papers that crowdstrike has posted, the impacted companies are extremely likely to be successful. It seems overwhelmingly likely that the companies are going to be able to convince a judge/jury/arbiter that crowdstrike acted with gross negligence and very plainly caused both direct losses and indirect reputational harm to the companies.

I’m not sure crowdstrike will even fight it, to be honest. I would assume most of this is going to be settled out of court and we will see crowdstrike crumble in the coming years.



TIL— thanks! Now it’s time to painfully go through my slack / email history and see how many times I made this mistake :)


> not sure crowdstrike will even fight it

To my knowledge only Delta is suing and CrowdStrike is kicking and screaming about it [1].

[1] https://www.cnn.com/2024/08/05/business/crowdstrike-fires-ba...


It’s a really bad look for crowdstrike to be going down this route. Then again, I don’t think many companies are going to be adopting crowdstrike in the coming years, so I suppose their only option is to defend their stock value at any cost while the company recoils


A lot of companies have insurance on events causing them to lose sources of income. Whether that's farmers having crop insurance, big box retailers having insurance for catastrophic damage to their big box, I would assume there's something for infrastructure collapse to bring sales to $0 for the duration.

Even if everyone that was affected sued ClownStrike for 100% of their losses, it's not like ClownStrike has the revenue to cover those losses. So even if you're a fan of shutting them down, nobody recovers anything close to actual losses.

So what would you actually propose? Bug free code is pretty much impossible. Some risk is accepted by the user. Do you seriously think that software should be absolutely 100% bug free before being able to be used? How do you prove that? Of course, the follow up would be how clean is your code that you feel that's even achievable?


>Bug free code is pretty much impossible. Some risk is accepted by the user.

This wasn't your average SW bug, it was gross negligence on the part of Crowdstrike, who seem not to have heard of SW testing on actual systems or canary deployments. Big difference.

Yeah, SW bugs happen all the time, but you have to show you took some steps to prevent them, while some dev at Crowdstrike just said "whatever, it works on my machine" and pushed directly to all customer production systems on a Friday. That they didn't have any processes in place to prevent something like this is the definition of gross negligence.

That's like a surgeon not bothering to sterilize his hands and then saying "oh well, hospital infections happen all the time".


> That's like a surgeon not bothering to sterilize his hands and then saying "oh well, hospital infections happen all the time".

And hospitals and doctors have malpractice insurance. They also go through investigations, though they have their own brotherhood, and it is difficult to get other doctors to testify against one of their own. There are also stories of people writing "The other leg" in Sharpie on their good leg because of moronic mistakes like removing the left appendage instead of the right. So even doctors are not above negligence. We just have things in place for when they slip up. Why you think ClownStrike is above that is bewildering.

At the end of the day, mistakes happen. It's not like they have denied they were at fault. So I'm really not sure what you're actually wanting.


>It's not like they have denied they were at fault. So I'm really not sure what you're actually wanting.

Paying for their mistake. In money. Admitting for their mistake is one thing, paying for it is another.

If your doctor made a mistake due to his negligence that costs you, wouldn't you want compensation instead of just a hollow apology?


Want vs receive are two entirely different things. If someone did something against me out of malice, damn straight I want ________. If someone makes a mistake, owns up to it, and changes in ways to not make the same mistake again, then that's exactly the opportunity I'd hope someone would allow me if the roles were reversed. This particular company's mistake just happened to be so widespread, due to their popularity, that it seems egregious, but there have been other outages that lasted longer and did not draw this much attention. Was it an inconvenience? Yes. Was it a silly mistake in hindsight? Yes. Was it fixable? Yes. Was it malevolent? Nope. Should you lose your job for making this mistake?


The bug was egregious.

Using regexp (edit: in the kernel). (Wtf. It's a bloody language.) And not sanitizing the usage. Then using it differently than in testing. And boom.

There's people, and there's companies.

This company ought to be nuked.


Genuinely, what good does that do?

It's all well and good to write dramatic meaningless comments on social networks like Hacker News, but if your desire had actual consequences, can you honestly say that "nuking the company" is a net positive?


Is keeping CrowdStrike around a net positive?


> can you honestly say that “nuking the company” is a net positive?

Yes, of course it would be positive. In the short term, it removes one incompetent high-risk company from the industry.

But more importantly long term, it would do a lot to encourage quality in the industry if it was known that such an outcome is possible.


Well, in America we've got something called corporate personhood, and it's an odd concept. It seems like an unfair concept to me as a citizen of America.

And you know, laws are supposed to keep you feeling like you're living in a fair world, right?

So, nuke the company that caused billions of dollars in losses, millions of hours of wasted human time, and potentially loss of life, though we haven't yet had a study that identifies the people who lost their lives because of disruption to healthcare services, heart attacks due to stress, etc. Nuke them. Nuke that corporate person. Force the humans who comprise that corporation to rebuild it as a better corporation.


You should look up Arthur Andersen.


Bug-free code is impossible. Stupid, negligent bug-free code, however, is very much doable. You just can't hire anyone who happens to be able to fog a mirror to write it.


If you think this was written by a moron rather than being a breakdown in procedures, then I'd think you'd be the one who barely fogs a mirror. This is no different from the multiple times that AWS us-east-1 has gone down and taken out a large portion of the internet when they've pushed changes. Do you think AWS is hiring moronic mirror foggers who cause havoc, or is this just an example of how, even within a bureaucratic structure like AWS's, it is still possible to sidestep best-laid plans?


> Is there any path for software engineers to reach this level of accountability and norms of good practice?

Yes, time. Civil engineering has thousands of years of history. Software engineering is much newer, the foundations of our craft are still in flux. There have been, at least in my country, legislative proposals for licensure of system analysts, electronic computer programmers, data processing machine operators, and typists(!) since the late 1970s; these laws, if approved, would have set back the progress of software development in my country for several decades (for instance, one proposal would make "manipulation and operation of electronic processing devices or machines, including terminals (digital or visual)" exclusive to those licensed as "data processing machine operator").


> set back the progress

> exclusive to those licensed

Sounds to me like it just would've made a lot of money for whatever entities give out the licenses.

On the other hand, I've read speculation on here that some countries are short on entrepreneurs entirely due to the difficulty of incorporating a small business, so maybe.


Civil engineering mostly requires you to have a government-verified certificate and to work in the country your infrastructure will be deployed in.

Software engineering doesn't, and that makes criminal prosecutions that much harder. There's no path to making it happen.

Financial liability for the company in question? Sure, that's probably doable. "Piercing the corporate veil" and punishing the executives who signed off on it? Harder but not impossible. Punishing the engineer who wrote that code, and who lives in a country with no such laws? Won't happen.


> Civil engineering mostly requires you to have a government-verified certificate and to work in the country your infrastructure will be deployed in.

It's a relatively small (and sharply defined) pool of people who can be called a civil engineer.

Are we saying we want to segment software engineering (from coding) - the same way civil engineering is segmented from construction?

Otherwise we're talking about placing specialist liability upon a non-specialist group. This seems unethical.


> If a bridge collapses, there’s not only financial liability but the potential for criminal liability as well

If a bridge collapses people die. To my knowledge, nobody died or was put in mortal peril as a result of the Crowdstrike debacle.


The deaths, if any, were probably indirect. E.g. ambulances not turning up in time etc. due to paper and pen fallbacks.

With all the hospitals hit by the outage, I would be surprised if the number of patients who died is zero.


> E.g. ambulances not turning up in time etc. due to paper and pen fallbacks

Sure. Did this happen?

Why were the “emergency management downtime procedures” insufficient [1]?

[1] https://www.healthcaredive.com/news/crowdstrike-outage-hits-...


If they were equally good as the non-emergency procedures, why wouldn't we use them all the time?


> why wouldn't we use them all the time?

Because they’re more expensive. They’re not all “equally good”; they’re good enough to keep people alive. (You repurpose resources from elective and billing procedures, et cetera.)


I would expect them to be good enough to prevent "obvious" deaths-from-failed-procedures, but deliver a slightly lower quality of care, so that if out of 100 very seriously ill people 50 survived during normal operation, this would turn into e.g. 49.

All of this without the person obviously dying due to the alternative procedures - just e.g. the doctor saw the patient less often and didn't notice some condition as early as they would have under normal procedures.

Would you consider this assumption to be wrong? (I am a layperson, not familiar with how hospitals work except from being a patient.)


This betrays a lack of understanding.

What resources are you repurposing from elective procedures exactly? Your patient load hasn’t changed, and day surgical instruments and supplies are from the same pool. There’s no “well this pile of equipment is only for elective procedures”.

I’m not even sure what “billing procedures” you’d repurpose (especially in your context of “keeping people alive”).


> Your patient load hasn’t changed, and day surgical instruments and supplies are from the same pool

The outage didn’t change any of these things either.

> not even sure what “billing procedures” you’d repurpose

At Mount Sinai, billing staff were redirected to watch newborn babies. Apparently the electronic doors stopped working during the outage.


> The outage didn’t change any of these things either.

Never said that it did. I just don't think your idea of emergency downtime procedures at a hospital matches what they actually are. There's paper and offline charting, most meds can be retrieved similarly, and so on. I heard a claim (from someone here) that an ER was unable to do CPR due to the outage, which could not remotely be true. Crash carts are available and are specifically set up to not require anything else but a combination. Drugs, IV/IO access, etc.

> At Mount Sinai, billing staff were redirected to watch newborn babies.

That sounds like something I would have imagined security doing. To be clear, what they most likely meant here is in the sense of "avoiding abduction of a newborn", not any kind of access to observe and oversee neonates.


Probably because our incredibly inefficient, burdened, and splintered healthcare system barely functions as is, and they do not have the time nor resources to pause and put in place an emergency downtime operating protocol that works as well as their 15 year old windows cobweb


> because our incredibly inefficient, burdened, and splintered healthcare system barely functions as is, and they do not have the time nor resources to pause and put in place an emergency downtime operating protocol

You just responded to an article about the implementation of emergency downtime protocols by speculating, baselessly, that such protocols cannot possibly exist because your mental model of our healthcare system prohibits it. Ironically, all within the context of why software development doesn’t hold itself to the rigors of engineering.


"In Alaska, both non-emergency and 911 calls went unanswered at multiple dispatch centers for seven hours.

Some personnel were shifted to the centers that were still up and running to help with their increased load of calls, while others switched to analog phone systems, Austin McDaniel, state public safety department spokesperson, told USA TODAY in an email. McDaniel said they had a plan in place, but the situation was "certainly unique.”

Agencies in at least seven states reported temporary outages, including the St. Louis County Sheriff's Office, the Faribault Police Department in Minnesota, and 911 systems in New Hampshire, Fulton County, Indiana, and Middletown, Ohio. Reports of 911 outages across the country peaked at more than 100 on Friday just before 3 a.m., according to Downdetector.

In Noble County, Indiana, about 30 miles northwest of Fort Wayne, 911 dispatchers were forced to jot down notes by hand when the system went down in the early morning hours, according to Gabe Creech, the county's emergency management director."

https://eu.usatoday.com/story/news/nation/2024/07/19/crowdst...

I mean, even if dispatch could handle it in some sense, it was certainly a problem, one that might have increased the average time to site for ambulances or firefighters. I haven't seen any report of any direct death.


> I haven't seen any report of any direct death

Exactly. Contrast that with a bridge collapse. It isn’t a mystery or statistical exercise to deduce who died and why.


There were numerous bridge collapses without casualties. Naturally, if one company could suddenly collapse 80% of Earth's bridges, direct deaths would be assured. It's great that, for some reason, no such company exists!


> were numerous bridge collapses without casualties

In how many of those cases were criminal charges brought? (It’s not zero. But it’s more limited.)


Because emergency downtime is not supposed to be both local and global. Don't worry, your startup will not eat those risks, but neither will those customers stay once insurance rewrites the guidelines. All that can happen has already happened; it's just consequences propagating now. There's nothing we can do about that with simple blame-shifting tactics.


I have argued for years that every business should have an analogue operations guide, tested every once in a while like a fire drill, down to pre-printed carbon-copy paper forms. A Lights Out, Phones Off Business Continuity Plan would have helped American Airlines too.


Hospitals were affected too, I don't think it's that far fetched to think some people died, or at least that some could not be saved due to this incident.


> Hospitals were affected too, I don't think it's that far fetched to think some people died

Absent evidence I’d say it is.

Hospitals have emergency downtime procedures [1]. From what I can tell, the outage was stressful, not deadly.

[1] https://www.npr.org/2024/07/21/nx-s1-5046700/the-crowdstrike...


Apply additional stress to a sufficiently large system that human lives depend on, and someone, somewhere will die.


> Apply additional stress to a sufficiently large system that human lives depend on, and someone, somewhere will die

Sure. Who did?

When a bridge collapses, this isn’t a tough problem. We don’t need to reason from first principles to derive the dead bodies. That’s the difference.


Hospitals and doctor’s offices were paralyzed by the outage. Transplant organs are often delivered by couriers on commercial flights. Many pharmacies were unable to fulfill prescriptions.

It wasn’t just vacation travelers that were affected by Crowdstrike’s incompetence.


I am positive that people in hospitals died as a direct result of this incident.


> I am positive that people in hospitals died as a direct result of this incident.

I'm less positive than you, just because my experience of healthcare infosec is that all a doctor has to do is say "I cannot be slowed down or prevented from doing x or people will die" and that's the end of any process or technical controls on x.

Same with utilities. I've seen the ICS engineers say "No you cannot put a password on this console because I may need instant access to prevent a blackout / explosion" and that pretty much ends the discussion.

Often that's not even wrong. Of course when there is a security incident there'll be a kneejerk reaction to that, and of course that's why ransomware groups love healthcare, but in the meantime, those risks seem reasonable.

Which means I'm guessing Crowdstrike killed a lot of healthcare billing but not a lot of critical care systems because it got ripped off those 30 seconds after install if it was ever installed at all.


> I am positive that people in hospitals died as a direct result of this incident

Do you have clinical or hospital administration experience? A source with evidence, even circumstantial?


Yes


> Do you have clinical or hospital administration experience? A source…

>> Yes

You managed a hospital and failed to implement emergency downtime procedures? (Because that is actually criminal.) Or do you have a source?


Apropos of anything else, “emergency downtime procedures” do not guarantee the same level of care as normal operations. I’ve worked in and out of hospitals as a critical care paramedic for years.


> “emergency downtime procedures” do not guarantee the same level of care as normal operations

Agreed. It’s also plausible someone had a heart attack due to the stress of flight cancellations. Do we have any evidence of either?

The difference between a bridge collapsing and everything we’re discussing is there isn’t much of a discussion around who died and why.


Deft goalpost shifting, nice.


Are you the orangutan doctor from futurama?


The commenter said they did not believe hospitals “have the time nor resources to pause and put in place an emergency downtime operating protocol” [1]. That is a reasonable guess. It’s not something one would expect from someone with “clinical or hospital administration experience.”

It’s a glib response, but so is “yes” to a request for attribution.

[1] https://news.ycombinator.com/item?id=41217683


Small reminder that the law already has a way of deciding liability for damages, and you don't have to directly drop a bridge on someone to get in trouble.


I completely agree. When I've negotiated contracts for my workplace and we explicitly write in the contract that the vendor is responsible for XYZ, it is my understanding (confirmed by legal, multiple times) that if XYZ goes wrong, they are liable for up to the amount in the SLA; however, that isn't a cap on liability in extenuating circumstances.

If this all gets brushed away, it significantly devalues the "well we pay $VENDOR to manage our user data, it's on them if they store it incorrectly" proposition, which would absolutely cause us to renegotiate.


You aren’t showing us the specific language that you’re referring to, nor do we know what a typical CrowdStrike contract looks like. You could be talking about apples and oranges here. I’ve seen both.


I was pretty sure that someone was going to "ackshually" me here, and here we are. The specific wording doesn't matter.

I've negotiated dozens of these contracts and the value add of a vendor managing the data is liability. If they aren't liable for data mismanagement, then their managed service is only worth the infra costs + a haircut on top, and we'll renegotiate with this in mind.


> Is there any path for software engineers to reach this level of accountability and norms of good practice?

There is no reason that software couldn't be treated with the same care and respect. The only reason we don't is because the industry resists that sort of change. They want to move fast and break things while still calling themselves "engineers." Almost none of this resembles engineering.


I’m a software engineer, with a degree, and SWE does have the same ethical principles and the same engineering process, from problem definition and requirements all the way to the development lifecycle, testing, deployment, incident management, etc. None of it includes sprints and story points.

Suffice it to say most SWEs are not being hired to do actual engineering, because the industry can’t get over the fact that just because you can update and release SW instantly doesn’t mean you should.


> SWE does have the same ethical principles and the same engineering process

The lack of certification means this training isn’t reinforced to the degree it is in engineering.


Right. If the coding industry mimics the construction industry, we wind up with one position called engineer that assumes most of the liability.

The other 99.99....% of software engineers will get different titles.

All of this ignores the individuals who are most responsible for these catastrophes.

Investors and executives deliver relentless and effective pressure toward practices that maximize their profits - at the expense of all else.

They purposefully create + nurture a single point of failure and are massively rewarded for the harm that causes (while the consequences are suffered by everyone else). Thanks to the pass they reliably get, their style of leadership ends up degrading every industry it can.


> If the coding industry mimics the construction industry, we wind up with one position called engineer that assumes most of the liability

If their sign off is required, this could work. The question is whether it’s worth it, and if it is, in which contexts.


> If their sign off is required, this could work. The question is whether it’s worth it, and if it is, in which contexts.

Civil engineers' liability is tied to standards set by gov agencies/depts and industry consortia.

Standards would have to be created in software engineering - along with the associated gov & industry bodies. In civil engineering, those things grew during/from many decades of need.


To be fair, software and technology is so magically transformative that even with warranty disclaimers like “this software comes with no warranty, and we limit our liability to the purchase price”, every company in the world still lines up to buy it. Because for them it’s effectively magic, that they cannot replicate themselves.

No individual software developer, nor corporation, is foolish enough to claim their software is free of bugs, that’s why they put the risk on the customer and the customer still signs on the dotted line and accepts the risks anyway. After all, it’s still way more profitable to have the potentially-faulty software than needing an army of clerks with pen and paper instead.

Most software has to be this way or it would be exorbitantly expensive. That’s the bargain between software developers and the companies that buy the software. The customer accepts the risks and gets to pocket the large profits that the software brings (because of the software’s low cost), because that’s better than the software developer balking at the liability, no software being written at all, and an army of staff at every airport writing out boarding passes by hand. There are only a few kinds of software that aren’t this way, for example the software in aircraft or nuclear power plants. That software is correspondingly extremely expensive. Most customers that can, choose to accept the risks so they can have the larger profits.


> Software engineering, of course, presents itself as another worthy cause, but that is eyewash: if you carefully read its literature and analyse what its devotees actually do, you will discover that software engineering has accepted as its charter "How to program if you cannot.".

— Edsger Wybe Dijkstra, 1988. (EWD1036)


I'm ok with that. I don't want to keep everyone out except just those who happen to have just the right mind set. Programming is about developing software for people, and the more viewpoints are in the room, the better.

Some pieces are more important than others. Those are the bits that need to be carefully regulated, as if they were bridges. But not everything we build has lives on the line.

If that means we don't get to call ourselves "engineers", I'm good with that. We work with bits, not atoms, and we can develop our own new way of handling that.


> I don't want to keep everyone out except just those who happen to have just the right mind set.

Neither do I. Neither did Dijkstra. EWD1036, “On the cruelty of really teaching computing science”, is about education reform, to enable those who don't "happen to have just the right mind set" to fully participate in actual, effective programming.


I prefer to call it "computer programming." If the title is good enough for Ken Thompson or Don Knuth then it's good enough for me.


> If that means we don't get to call ourselves "engineers", I'm good with that.

I suspect this particular title-exaggeration is fueling this particular fire.

Going forward, I believe we need to be aware that software-controlled mechanics grew out of two disparate disciplines; it presently lacks the holistic thinking that long-integrated industries have.


Software (controls) engineers at VW during the emissions scandal went to jail, engineers at GM were held liable for the ignition switch issue (not mostly in software, but still). I expect we'll eventually see some engineers/low level managers thrown under the bus with Boeing. It definitely happens, but not as frequently as it could. That said, I definitely prefer Amazon's response to the AWS East 1 outage back in 2016 -- the engineer wasn't blamed, despite the relatively simple screw up, but the processes/procedures were fixed so that it didn't happen again in the last 8 years. Crowdstrike is a little bit gray in that regard -- people should have known how bad the practice of zero testing on config updates was, but then again, I've seen some investigation saying that the initial description wasn't fully accurate, so I'm waiting for the final community after-action report before I really pass judgement.


> I appreciate that we’re finding the humour in this catastrophe but what about the question of liability?

One of the biggest and most used pieces of software (the Linux kernel) comes with zero warranties. It can fail, and no one would be liable. Are we fine with that? Is the CS case different because it costs money? From a user perspective we don’t want software failing in the middle of an airplane landing, so whether the software comes from CS or github is of lesser importance.


How many bridges, would you say, does the average civil engineering firm deliver each year, each on only one day's notice, in response to a surprise change in requirements due to a newly developed adversarial attack?

Crowdstrike does this constantly.

You could demand the same level of assurance from software, but in exchange, you don't get to fly, because the capacity won't be there


I would find it more useful if liability here were attributed to the need to purchase such draconian tools: the certifications that require it and the C-levels who approve it. We would be better for it.


Oh Christ. Just drop it. A by all accounts legitimate security function of a product targeted at company-owned endpoints.

Please don’t devolve this conversation into you being upset about not getting admin rights on your work computer or whatever this is about.

Any (esp. larger) org would be criminally negligent to eschew using something like CrowdStrike in order to capitulate to some nerd that thinks that they have ownership over their work equipment.


Bruh, chill.

I don't give two craps about having admin rights on my work computer. Crowdstrike is bad software and a bad way to manage large deployments. They just proved it. I just think that we are also responsible for buying a solution that works that way.


On this website you are asking a population that would be responsible for this, so you will likely only get answers about how hard this is to solve, how it's not software engineers' fault, how we need to understand that software engineering is not civil engineering, how we need to be careful with this analogy, and how it's not our fault! Don't blame us when things go wrong, but also, give us all the money when things go right.

This is not the place for this question is what I’m saying.


They totally deserved it.

Those who think running a third-party closed-source Windows kernel driver (which parses files distributed from the Internet in real time) is good for security must also accept the consequences.

I'm sick of these so-called security consultants who always insist on checklists like installing a proprietary closed-source binary blob as a Linux kernel module into a system that otherwise consists mostly of free software (except for hardware drivers) and then think they did their job, and of the executives who pay a lot of money to these idiotic so-called security consultants.



> Are the licenses so ironclad that customers have no recourse?

Even on Hacker News, there was agreement that CrowdStrike screwed up, but then people also blamed IT staff, Microsoft (even after realizing it was a CrowdStrike issue), and the EU/regulators.

I imagine responsibility of each entity would need far more clarification than it does now.

If you want to define liability, there needs to be a clear line saying who is responsible for what. That doesn’t currently exist in software.

There is also the question of how people respond to risk.

Consider how sesame regulation led to most bread having sesame deliberately put into it. Industry responded by guaranteeing contamination.

Crowdstrike and endpoint security firms might respond by saying that only Windows and Mac devices can be secured. Or Microsoft may say that only their solution can provide the requisite security.


I’m interested in what those who suffered outages as a result of crowdstrike told their insurers with respect to “QA’ing production changes”

It’d be interesting to see if anyone tries to claim the outage as some sort of insurance event only to lose out because they let Crowdstrike roll updates into a highly regulated environment without testing


Probably in a decade or so after the AI crash. I have yet to see anything that comes close to “liability” for the digital realm.

US governments and businesses get hacked/infiltrated all the time by foreign adversaries yet we do not declare war. Maybe something happens in the dark or back channels. But we never know.


Engineering safety culture is built on piles of bodies and suffering unfortunately. I suspect in software the price of failure is mostly low enough that this motivation will never develop.


> but so far not much in the way of lawsuits

It hasn't been that long? The situation might be that there hasn't been sufficient time to yet gather evidence to commence lawsuits.


Your barber has more licensing requirements than a senior software engineer.

Regulations have not caught up with developers (yet).


> Is there any path for software engineers to reach this level of accountability and norms

Potentially controversial stance here, but most software engineers are not engineers. They study computer science, which doesn't include coursework on engineering ethics among other things. I would say that by design they are less prepared to make ethical decisions and take conservative approaches.

Imagine if civil engineers had EULAs for their products. "This bridge has no warranty, implied or otherwise. Cross this bridge AT YOUR OWN RISK. This bridge shall not be used for anything safety critical etc."


> Is there any path for software engineers to reach this level of accountability and norms of good practice?

Heck, no.

Civil engineering doesn’t change. Gravity is a constant. Physics are constants. If Rome wrote an engineering manual, it would still be quite valid today.

Imagine if we had standardized software engineering in 2003. Do you think the mandatory training about how to make safe ActiveX controls is going to save you? Do you think the mandatory training on how to safely embed a Java applet will protect your bank?

Software is too diverse, too inconsistent, and too rapidly changing to have any chance at standardization. Maybe in several decades when WHATWG hasn’t passed a single new spec into the browser.

(Edit: Also, it’s a fool’s errand, as there are literally hundreds of billions of lines of code running in production at this very moment. If you wrote an onerous engineering spec, there would not be enough programmers or lawyers on earth to rewrite and verify it all, even if given decades. This would result in Google, Apple, etc. basically getting grandfathered in while startups get the burden of following the rules - rules that China, India, and other countries happily won’t be enforcing.)


> Civil engineering doesn’t change. Gravity is a constant. Physics are constants.

Physics may be a constant, but materials and methods are not. There is a reason why ISO/IEC/ICC/ASTM/ANSI/ASME/ASHRAE/DIN/IEEE/etc standards have specific dates associated with them.

> If Rome wrote an engineering manual, it would still be quite valid today.

Considering many engineering standards from a few years ago are no longer valid, this is almost certainly not true.


>> If Rome wrote an engineering manual, it would still be quite valid today.

We have some ancient engineering manuals. A book I read, most likely Brotherhood of Kings, remarked that Mesopotamian engineering manuals are primarily concerned with how many bricks will be required for a given structure.

The manuals are valid today, I guess, but useless. We prefer pipelines to brick aqueducts. Our fortresses are made of different materials and need to defend us from different things.


That’s only a formality, but reality did not change, and neither did the fact that those standards would still work even if they would be slightly inferior.


Physics will still be the same when your faulty software tells an airplane to dive.


In Canada, we have software and computer engineering programs accredited by the same entity (CEAB) that does civil engineering.

My program is more out of date (Java Server Pages, VHDL) but the school can't lower the quality of their programs. Generally, the standard learning requirements aren't on technology but principles, like learning OOP or whatever else. The CEAB audits student work from all schools in Canada to make sure it meets their requirements.

The culture itself is probably the most important part of the engineering major. They don't round up. If you fail, you fail. And I had a course in 3rd year with a 30% fail rate. Everything's mandatory, so you just have to try again over the summer.

A lot of people drop out because they can't handle the pressure. But the people that stay understand that they can't skip out on stuff they aren't good at.


I've got an ABET accredited Computer Engineering degree from a US school. The only thing it got me in interviews was questions about why not CS.

I did not follow the path to become a licensed Professional Engineer, because a) there was no apparent benefit, b) to my knowledge, none of my colleagues were PEs and I don't know how I would get the necessary work certification.

Maybe there's corners of software where it's useful to become licensed, but not mine.


There is nothing saying that allowing for some standardization means we have to be stuck at 2003 levels of state of the art. And actually, yes, many engineering disciplines do change. Civil engineering brings in new construction techniques, methods for non-destructive testing, improvements to materials, and on and on, but it doesn't do so in the free-for-all manner of the coked-up software industry. It's a proper engineering discipline because there's control: testing the best way to do things and then rolling that out.

If we (meaning software 'engineers', and I tepidly include myself in that group) had half the self-control of the 'proper' disciplines when introducing insanity like the 10000th new javascript framework for reading and writing to a database, maybe things would be better because there'd be less churn. Why does it have to move so fast? Software is diverse and inconsistent and rapidly changing because 'the industry' (coked-out developers chasing the next big hit to their resume to level up) says it should be. I just don't agree that we need that amount of change to do things that amount to mutating some data. If the techniques had been held at what was cool in 2007 until the next thing could be evaluated and trained on, while the knowledge and process around them kept growing, perhaps we'd be in a better position. I know I certainly wouldn't mind maintaining something that was created in the last decade of the previous millennium knowing it was built with some sort of self-control and discipline in mind, and that the people working on it with me had the same mindset as well.


Simple - if you restrict the software industry, the US loses to China or any other country that doesn’t give a damn. And unless you censor the internet, there’s absolutely no way to prevent illicit software from crossing the border.

Would a business get in trouble for using it? Sure. But if all the businesses in your country are at a competitive disadvantage because the competition is so much brighter elsewhere, and that "sloppily constructed" software is allowing international competition to have greater productivity and efficiency, your country is hosed. Under your own theory, imagine if the US was stuck with ~2007 technology while China was in 2024. The tradeoff would be horrific - like, Taiwan might not exist right now, horrific.

Regulating software right now would kill the US competitive advantage. It narrows every year - that would do it overnight. The US right now literally cannot afford to regulate software. The EU, which can afford it, is already watching talent leaving.

There’s also the problem of the hundreds of billions of lines of code written before regulations running in production at this very moment. There are not enough programmers on earth that could rewrite it all to spec, even if they had decades. Does Google just get a free grandfathered-in pass then, but startups don’t?


I hope you realize that "sowwy, there's too much code :3" will not fly with whatever government decides to regulate software after the next major cock-up. We can either grow up and set our own terms, or we can have them forced on us by bureaucrats whose last computer was an Apple II. Choose.


Bull - regulators can’t change reality.

The fact that China is X number of years behind us, is easily demonstrable.

The amount of code running in the US, Y, is relatively easy to estimate by asking around.

Proving that the amount of time it would take to modify Y lines of code to match any given law will exceed X number of years, thus putting us behind China, is also fairly easy to demonstrate, even if the exact amount of time is not.

Even our Apple II-era regulators know that going beyond that much effort (call it Z) is suicidal politically, economically, technologically, you name it. They might not understand tech, but they know it’s everywhere, and falling behind is not an option.

On that note, stop stereotyping our legislators. They have smartphones, younger aides, many of the oldest ones are retiring this cycle, etc.


> If Rome wrote an engineering manual, it would still be quite valid today.

“How to conduct water efficiently: first, collect a whole bunch of lead. Then construct pipes from said lead…”


Gravity is not constant; instead it varies by location and by height.

Bubble sort, however, is always bubble sort. A similarly large portion of what engineers do in software is constant.
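
To make the "constant" half concrete, here is bubble sort in Python; a throwaway sketch, but the algorithm itself is exactly what it was fifty years ago, whatever changes around it:

    # Bubble sort: repeatedly swap adjacent out-of-order elements until sorted.
    def bubble_sort(xs):
        xs = list(xs)                            # work on a copy
        for i in range(len(xs)):
            for j in range(len(xs) - 1 - i):
                if xs[j] > xs[j + 1]:
                    xs[j], xs[j + 1] = xs[j + 1], xs[j]
        return xs

    print(bubble_sort([5, 1, 4, 2, 8]))          # [1, 2, 4, 5, 8]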


I'd imagine we wouldn't have ActiveX controls in the first place.


Wishful thinking - the IRS is still running on COBOL; our nuclear weapons until a few years ago on Windows 95. The NYC subway still has a lot of OS/2.

Standardization does not stop bad engineering. Those who think it does have not witnessed the catastrophe a bad standard can cause. Go download and implement the Microsoft Office OOXML standard - it’s freely available, ISO approved, 6000 pages, and an abomination that not even Google claims to have correctly implemented.


You're making some points for me. You are assuming COBOL, Windows 95, or OS/2 are bad because they're old. Such assumptions are the antithesis of "engineering."


Old technology isn’t necessarily bad in itself. It’s well documented and understood.

Where it’s bad is when the equipment to run that software no longer is manufactured. You can’t get a new computer to run Windows 95. Not even in the military. Your only option is to virtualize, adding a huge possible failure mode that was never considered previously.

Where it’s bad is when changes are needed to adapt to modern environments, and nobody’s quite sure about what they are doing anymore. There’s no test suite, never was, the documentation is full of ancient and confusing terminology, mistakes are made.

And on and on…


It sounds as if you're saying that these were bad things because they were always bad. And maybe they were. But we might never have any software at all if we only had good software.


I'm not saying they're bad, because I don't know.


Apologies. I misread your intentions.


This is so wrong

Most suspension bridges were built without a theoretical model, because we didn't have one yet. Theory caught up much later.

Innovation often happens in the absence of theory.


>Most suspension bridges were built without a theoretical model

That's not true, even for the first suspension bridge ever built (in the early 1800s), but it is true for example that many useful and impressive aircraft were built before the development of a physical theory of flight.


Galloping Gertie is an example in America.

Your definition of theory only fits if you scope it so narrowly that it's useless to the problem space... because the point is that theory didn't entirely cover that space. And bridges did collapse because of that.

But lack of theory didn't mean lack of rigorous testing. Gertie was built based on theory. Many other bridges were based on test results... and did fine.


You've retreated from, "built without a theoretical model, because didn't have one yet," way back to, "theory didn't entirely cover that space." This is commendable.

>Many other bridges were based on test results

I'm going to go out on a limb a little and assert that not a single bridge was built out of steel or iron in the last 200 years in the US or the UK without a static analysis of the compressive and tensile forces on all the members or (in the case of bridges with many hundreds of small members) at least the dozen largest members or assemblies.


It's disingenuous to read my comment as saying no theory existed, ever.

It should be obvious that when you talk about theory covering a product, there either is a theoretical framework, or there isn't.

In the case of suspension bridges there wasn't. There was no mathematical theory to explain how the bridge stayed aloft, or how much it could carry.

What bridge builders of high quality did was make mock (small) models, and test how many rocks they could put on them.

I think you will concede that that isn't a theoretical model. It's a practical one.

And this happens very often. People experiment and build useful things, but no one understands why they work. Until later people come along and explain the phenomenon.


This issue goes beyond CrowdStrike and points to the general approach to security, which is buying products off the shelf to satisfy regulators and insurers while not actually caring what they do or how they work.

I'm not saying tech shouldn't be regulated, but our current model of "buy this thing to shed liability" doesn't work. The worst part is, the people who saw this coming (i.e. your IT department) probably can't do a damn thing about it because it's mandated at high levels in the company either for "cyber insurance" requirements or some other regulation. Madness.


> The worst part is, the people who saw this coming (i.e. your IT department) probably can't do a damn thing about it because it's mandated at high levels in the company either for "cyber insurance" requirements or some other regulation.

I've worked with many excellent IT people who feel this way, but the vast majority of my experience with IT departments has been that as long as the contract covers what it needs to, they don't actually care if it solves the problem or not. At a previous job, software similar to crowdstrike was installed on my workstation over a weekend, and I came back to 20% slower compile times (I was working on them at the time so I had dozens of measurements). I had ETL traces showing the problem was the software, but IT refused to acknowledge it because the vendor contract said there was no performance impact expected for our workload.
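
For anyone who wants to gather the same kind of evidence, the measurement itself is simple: time the same clean build repeatedly with the agent's filtering enabled and disabled, then compare medians. A rough, self-contained sketch; the build command is a placeholder, not a real project:

    import statistics
    import subprocess
    import time

    BUILD_CMD = ["make", "clean", "all"]   # placeholder; substitute your real build
    RUNS = 10

    def timed_build():
        start = time.perf_counter()
        subprocess.run(BUILD_CMD, check=True, capture_output=True)
        return time.perf_counter() - start

    if __name__ == "__main__":
        # Run once with the endpoint agent enabled and once with it disabled,
        # then compare the two reported medians.
        samples = [timed_build() for _ in range(RUNS)]
        print(f"median build time over {RUNS} runs: {statistics.median(samples):.1f}s")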


That is my experience, too. I attribute it to IT / sysadmin jobs having a lower bar to entry and becoming more of a "watered down" business unit that just follows orders without much say or care for anything.


Most IT departments wouldn’t have seen this coming, and certainly would’ve been right to not base their entire security strategy around it. I’m not sure where this narrative is coming from. Falcon delivered and still delivers real, genuine security benefit to its customers. That does not mean that it eliminates all risk, and does not mean that it doesn’t introduce risk of its own.

It’s literally a game of tradeoffs like all engineering problems. This shouldn’t be that foreign to anyone here. Suddenly HN is full of security experts that are fuelled with 20/20 hindsight and recency bias, explaining how companies could’ve dodged this bullet without considering what very real bullets were being dodged by using Falcon in the first place.


> Suddenly HN is full of security experts that are fuelled with 20/20 hindsight and recency bias

That is incorrect, many in tech saw these blanket IT policies being implemented and didn't like the prospects but couldn't change anything. At my workplace, policies like password rotations every 90 days (NIST recommends against), resource heavy machine scans, and nonsensical firewall rules are all a result of the company buying "cyber insurance".

> It’s literally a game of trafeoffs like all engineering problems

Adding a single point of failure to all of your systems is a pretty big tradeoff to make for questionable gains.

> Falcon delivered and still delivers real, genuine security benefit

Rhetorical question, but I'll ask: why did some of the machines affected in the CrowdStrike outage even need EDR software installed in the first place? Examples are flight status displays, critical 911 and healthcare machines, warehouse cranes, etc., things that don't immediately pass the smell test for having an internet connection.


To your final question: those machines likely had a connection to the internet at some point, directly or indirectly through something else, which may have left them vulnerable.

It speaks to more than just EDR solutions; it's also about appropriate segmentation of critical endpoints on the network. Flight status displays may well have had an internet connection.

To your middle point, I don't think people understood the reality of how (or whether) Crowdstrike could become a single point of failure on their systems. We now know it was a single point of failure that caused systems to completely shut down, but up until that point I don't think that potential was well understood, nor how likely it was.


This may end up in one of those court evidence videos or lawsuits - this isn’t a funny thing.

This would have been a closed moment (just a bunch of security nerds discussing something), but instead it is now freely available to a wider general public with major grievances, who can use it to lampoon them.


> This may end up in one of those court evidence videos or lawsuits - this isn’t a funny thing.

I didn't take the CrowdStrike executive as making light of the situation, at all. If anything, I thought his speech took it seriously, acknowledged it was a major, major fuckup, and basically said he was accepting the trophy as a mark of shame and as a cautionary tale for future CrowdStrike employees.

I thought the exec accepting this was a true class act (to emphasize, saying that in no way should imply that I think it absolves CrowdStrike of responsibility, or liability, for what happened).


Context is everything. They had every chance to own up from the day of until now. A ‘lulz haha we goofed up’ in a nerdy security conference doesn’t seem like the right place or time.


I'm not sure what your definition of "own up to it" is, but they issued an apology day-of.


Apologies don’t mean anything from a c-level suit (George Kurtz) that has a known history of causing outages. The culture of accountability at crowdstrike is a facade.


Got it, so to own up to it they have to change the culture... overnight?


[flagged]


totally owned them with 'clownstrike'. glad you were proud enough of it to use it twice


> Love how you post from a throwaway account.

:eyeroll:

As if "xyst" somehow lets us identify you?


Can make it funny, via a T-shirt:

When I use REGEXP I use it in my KERNEL CODE

Tragedy and comedy, same coin...



Computer security issues cropped up during the Vietnam War era, and the US did the work and found actually effective computer security models. We're in a society that has effectively memory-holed them.

Why is it necessary to have scanners running 24x7 on everything a computer is about to run?

Why is it necessary to have Operating Systems that rely on ambient authority?

Blaming Crowdstrike does nothing but distract from the fundamental design failure we ignore every day in our operating systems, Linux, MacOS, Windows, et al.


Meanwhile they're still blaming Microsoft, as if it isn't possible to run the code they update outside of the kernel and use the kernel-mode code only for observation and action, not logic.
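
For illustration, the split being described looks roughly like this. A conceptual sketch in Python for brevity (obviously not actual driver code); the names and the fail-open default are assumptions for the example, not how any particular vendor behaves:

    # Conceptual sketch: the kernel component only reports events and enforces
    # cached verdicts; all frequently-updated parsing/detection logic lives in
    # a user-mode service that can crash without taking the OS down with it.
    EVENT_QUEUE = []     # stands in for the kernel -> user event channel
    VERDICT_CACHE = {}   # stands in for verdicts pushed back into the kernel

    def kernel_shim_on_process_start(image_path):
        # Kernel side: no parsing of update content here, just report and enforce.
        EVENT_QUEUE.append(image_path)
        return VERDICT_CACHE.get(image_path, "allow")   # assumed fail-open default

    def user_mode_service(rules):
        # User mode: the logic that changes with every content update.
        while EVENT_QUEUE:
            path = EVENT_QUEUE.pop(0)
            VERDICT_CACHE[path] = rules.get(path, "allow")

    if __name__ == "__main__":
        print(kernel_shim_on_process_start(r"C:\temp\payload.exe"))   # "allow" (no verdict yet)
        user_mode_service({r"C:\temp\payload.exe": "block"})
        print(kernel_shim_on_process_start(r"C:\temp\payload.exe"))   # "block"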


I work in IT and I happened to be the poor bastard on call when Clown Strike took out the majority of our infrastructure. If it wasn't for my own personal refusal to use cloud-based bullshit, we would probably have been down for days instead of hours. The fact that people like my IT director saw nothing wrong with this and are taking 0 steps to prevent such bullshit makes me quite worried that I will soon have to deal with some other catastrophic cloud-based failure in the near future.

I keep repeating ad nauseam "only idiots rely on other people's computers", and I stand behind that statement 100%.


What's strange is that there are managers/executives who understand why it's bad to have a SPOF (single point of failure) in the internal infrastructure but who are OK with having a product/service from an external vendor as a SPOF. As if having a contract and paying money for it means it is created and maintained by infallible superhumans (as opposed to the internal engineers they don't trust). Such misplaced trust puzzles me.


My director mistakenly thinks the more it costs the better it is; he refuses to even consider anything FOSS. Before he came along we ran everything in house, and we still had issues of course, but downtime was nearly nonexistent because we could take action immediately instead of waiting for whatever cloud service messed up today to feel like getting around to it.

And of course we now get to pay monthly for the privilege of being at someone else's mercy as opposed to before when we paid once and went on our merry way.


I 100% agree. Additionally, I am amazed to see that people can pay outrageous fees for cloud services such as Azure VD. For a fraction of the yearly cloud budget, companies can create crazy stable, offline-capable infrastructure themselves.


You sound like a very aggressive person. I don’t think I’d like to work with you regardless of whether or not you were right. Maybe you’d get your point across better if you weren’t so aggressive about it.


k


Can only do so much when idiot CTOs take their advice from CTO summits, consultants with their own perverse incentives, and of course random conferences


Previous Pwnie Award winners for comparison:

https://en.wikipedia.org/wiki/Pwnie_Awards


most of the previous "most epic fail" awards are naturally dominated by microsoft.

blame and shame can continue, but if you believe "production failures" such as this one are due to bad processes, then m$ definitely played a big role here.


I'm wondering how on earth the CEO and CTO have managed to keep their jobs after this fiasco?


So 911 was not working in several cities and hospital workflows were slowed to a halt, and they have time to go to defcon for a joke?


Are there hospitals that haven't fixed their computers yet? Obviously CS messed up, but aside from paying for damages and making process changes, I'm not sure what else they can do at this time. Warning people to not repeat their mistakes doesn't seem like a bad use of time.


CS deserves the blame here, but putting CS into a critical system like 911 is IMHO a huge mistake (though whoever did this likely knew they would dodge the blame, so why should they care).


I think they’re owning up to their mistakes instead of dodging the issue. I still feel that if they did the right testing, they shouldn’t be blamed for everything. It’s pretty standard for IT teams to avoid auto-updates and instead manually review them—especially in critical sectors like healthcare, aviation, and government. For instance, at my workplace, we’re not allowed to auto-update VsCode.

They mentioned they ran tests which unfortunately returned false positives. While it’s true they could’ve been more thorough, the affected companies also dropped the ball by not doing their own checks


> I still feel that if they did the right testing, they shouldn’t be blamed for everything.

This update crashed 100% of the Windows systems it got installed on, which means either their testing did not involve actually loading it on real world computers at all or that blue screening and boot looping did not cause the test to fail. It is objectively clear that they did not do the right testing. There is no excuse for this update having ever left the earliest stages of a proper test process.

It's not like this is a case of an unexpected interaction with a configuration not found in the test lab.
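
To be concrete about the bar being described: before publishing, install the update on a handful of real or virtual Windows machines, reboot them, and confirm they come back and check in. A self-contained sketch that only simulates that loop; the VM names and the "agent reports in" probe are hypothetical stand-ins for a real test lab:

    # Pre-release smoke test: block the release unless every test machine
    # survives installing the update and rebooting.
    TEST_VMS = ["win11-x64", "win11-arm64", "win2019", "win2022"]

    def install_and_reboot(vm, update_path):
        # Stand-in for: push the update to the VM, reboot it, and wait for the
        # agent to check back in. Returns False if the machine never comes back.
        print(f"installing {update_path} on {vm} and rebooting...")
        return True   # a real lab would actually probe the machine here

    def smoke_test(update_path):
        for vm in TEST_VMS:
            if not install_and_reboot(vm, update_path):
                print(f"{vm} did not come back up: block the release")
                return False
        print("all test machines survived; proceed to a staged rollout")
        return True

    if __name__ == "__main__":
        smoke_test("content-update-001.bin")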

> It’s pretty standard for IT teams to avoid auto-updates and instead manually review them—especially in critical sectors like healthcare, aviation, and government.

This component was not able to be controlled in this way. Systems that were configured to be delayed on other CrowdStrike updates still got this particular update immediately with no ability for IT departments to control them.

> They mentioned they ran tests which unfortunately returned false positives.

Again, whatever tests they actually ran clearly didn't involve actually loading the update in to the actual driver. Their explanation sounds like they may have validated the formatting of their update or something like that but then just sent it.

> While it’s true they could’ve been more thorough, the affected companies also dropped the ball by not doing their own checks

No they did not because they could not. They may have dropped the ball when installing Crowdstrike in the first place, but the whole reason this was such a widespread thing affecting so many high priority systems is that it wasn't able to be controlled in the ways IT departments would want.


> This component was not able to be controlled in this way. Systems that were configured to be delayed on other CrowdStrike updates still got this particular update immediately with no ability for IT departments to control them.

I had to look this up because I had not heard about this. I didn't understand that this bypassed companies' protections. I take back what I said; I guess I'm used to companies like those having poor IT standards and then, once something goes wrong, pretending that they had no part in it.


They weren't even nominated, so this is such an epic failure they won it as a late entry.

Also, that pony isn't typically super-glued to that structure, so this was also a special trophy! :)))))))))


This comes across as incredibly tone deaf. People suffered degraded medical care, billions were lost in the airline industry, billions more were lost in productivity, and ultimately it's time that people cannot get back. Yet these clowns are accepting joke awards as if this is something to hang on your trophy wall.

This is actually a c-level executive at ClownStrike, by the way.

> Michael Sentonas serves as President and is responsible for CrowdStrike’s product and go-to-market functions, including its sales, marketing, product & engineering, threat intelligence, privacy & policy, corporate development, corporate strategy and CTO teams

https://www.crowdstrike.com/about-crowdstrike/executive-team...

The whole C-level executive suite at ClownStrike needs to go. This company needs a real CTO like Jeremy Rowley. Although I suspect a good person like him would never join the ranks of ClownStrike


Did people actually watch the video??? I just don't understand how they think Michael Sentonas was making a joke of all this. If anything, he was acknowledging the horrible outcome of what happened.

I don't think this absolves CrowdStrike of responsibility at all, but what would you like him to do, commit hara-kiri on stage?


I watched the video. I saw this asshole executive with a huge shit eating grin on his face the entire time he gave his PR-managed speech and lapped up the applause of perhaps the stupidest audience in tech history.


I would like to have had him wear a red clown nose & walk to the stage to the soundtrack of The Empire Strikes Back. Nothing more, nothing less.


Clearly, step down. Including CEO (George Kurtz)

The shit rolls downhill, starting from the c-suite. These clowns clearly cannot change the org and are blind to the issues. Keeping the same leadership means nothing will change. The fact that they even poke their head up for what is clearly a marketing/PR stunt without showing any substance shows how clueless they are.

Guy has “20 years” of experience which clearly doesn’t amount to shit. Maybe 20 years of junior experience and falling upwards.


People who write "ClownStrike" aren't contributing to discourse. Downvote and move on


Yeah, the event was very clearly a crowd stroke.


Well, this is the second time their CEO caused a major outage by pushing a flawed update for a security product. This whole thing is probably a joke to him.


No, the most epic fail is still held by Boeing...they launched an alpha capsule to ISS and still cannot get the astronauts back to Earth...


It took our IT department until Monday afternoon to put out a message reaffirming their confidence in crowdstrike. Before any postmortem on their side or ours. I'm guessing they got offered a big discount to renew.

My IT team have a lot more confidence in crowdstrike than I have in them.


Serious companies should have better QA. This includes Crowdstrike customers.


Wonder what kind of deliberation led to them accepting the award.


something along the lines of, how do you reach your most influential customers all at once with a sincere message. this was the right thing to do.

anyone who makes serious decisions will see acknowledging this in front of peers was correct. it's funny how the hacker ethic of celebrating failures as lessons becomes impossible when you have a chorus angling for leverage all the time. the failure mode of most tech is catastrophic, where all the convenience you get from it disappears suddenly and randomly. I'd be mad about the lost time during the recovery and over missed flights or even health services, but managing that risk is the job.

to anyone else, next time something fails and messes up your plans or puts you in a spot, try to remember a time when you had a chance to do something well but didn't because you were thinking, "not my problem."


They screwed up. They know it, and everybody else knows it. Trying to pretend they didn't would just make them look even more lame.

Or, viewed from the other side: Owning your failures makes you a grownup.


Shhh, the people here want blood.

BTW, did you know that there's an endless stream of "satisfying" drama on YouTube? I heard that Mr. Beast is finally in some hot water!


The outage caused actual human deaths. Yeah, most people here probably think the priority is criminal justice, which you might call "blood" in a dishonest attempt to make us appear cynical when they're the ones accepting funny nerdy awards after causing so much chaos.

Maybe next time a doctor causes death because of their negligence, they should accept an "oopsie award"? It would be le funny lulz XD


In the cases where outages caused human deaths you’re 100% right that there should be worse consequences.

But in many cases, hopefully it’s also a lesson to the places where people were harmed, to never let one piece of software get in the path of life or death without redundancies.

(Not at all defending their screwup, I just don’t think EVERYONE deserves the same restitution, some more than others.)


Refusing to show up generates the same, if not more, negative PR without the opportunity to show humility and promise to do better.


Hubris: the belief that they are special and will get away scot-free with all the damage they caused.


Did you click through to the video? Because the acceptance speech seemed to show the opposite of hubris to me. Specifically in owning up to the mistake, and using the award as a reminder to do better in the future.


I’m sure he does not represent the PR and legal teams at CrowdStrike. I’d take anything he says with a grain of salt


CrowdStrike is publicly traded and he's accepting the award as president of the company. You bet your ass he does represent the PR and legal teams here.


Exactly. In fact when I saw this I was impressed that he said things that his PR and legal teams would’ve STRONGLY advised him not to say.

If anything, his attitude of (paraphrasing) “we accept this; we screwed up, and we will prominently display it as a reminder to our staff to never let this happen again” was about the best response he could’ve given.


Interestingly, multiple studies have shown that doctors owning up to mistakes and apologizing results in smaller med mal settlements. I believe a few carriers recommended it.

As an attorney, I’ve made mistakes and screwed things up. The initial instinct is to be less than candid and not admit anything. Then, after a sweaty sleepless night, I bit the bullet and was open and honest with the client, admitted fault, apologized, and offered to do what I could to make it right. Every reptilian synapse was screaming “don’t do it,” but it was the right thing to do, and I have no doubt, cost me much less in the end.


"No such thing as bad press" I assume.


Comes a point when some people just have an "ethical breakdown (breakthrough)". It's a positive thing. It's where recovery starts. He's owning it. There's no absolution until you throw yourself in front of the lions. At this point who cares what the PR and legal teams have to say. They'll be lucky to have a job in a few months.

I really hope he makes the most of a great opportunity to tell some truth, so that we can break the cycle of bullshit solutions causing further pain and loss in the future: Something like;

   "Thanks for the award. Well, we all knew this managed endpoint
   cybersecurity shit was never gonna fly. And on Windows? Seriously?!
   You all knew it too, but you pays yer money and takes yer chance
   for a lucky charm to keep the auditors and insurance ghouls
   away. So here we are. We all got caught with our pants round our
   ankles. It was a good racket while it lasted. Oh well... Anyone
   hiring?"


Probably the firm they outsourced their public relations to thought this would be a good idea. It's backfired.


They understand who’s buying their product. It’s not the information security teams who cleaned up this mess, but rather the operations and end user compute teams.


Never in a hundred years would I have expected to see the CEO of Crowdstrike at Defcon. The two are at opposite extremes of the corporate spectrum.


This is how counter-culture is systematically uprooted


It would’ve been the best place to say sorry, but as far as I understand this would have legal consequences.


The only reason this could be funny is because the software industry has found a way to excuse itself from any liability.

There is no other industry where someone could cause so much damage and laugh about it, not least because the liability alone would have led to the company's collapse.

Can you imagine a company hired to reinforce a bridge against ship strikes instead causing its collapse?

How long is that company gonna last? Even if no one dies or is injured it will be run out of business.

Only in Tech can such a company not only survive but laugh about it.

And that’s even before we get to how amateurish the mistake these guys made was.


Fuck up the world's computers, piss off all of its IT teams, and then send people $10 Uber Eats gift cards as if that'll get you anything, maybe a Coke at best... but it's further admitting fault. That's like, a tip. Like here, have a sandwich while you fix our fuckup.

They don't care. Management will all get six-figure bonuses too for 'weathering the storm' after the mishap, and will probably get more money because look what they can withstand: literal technology murder, and they get away with it.

It's almost movie-grade evil villainy tier stuff lol


Honest question, what should they do? Uninstall the company and give up? Ignoring their actual response, what is a good response to this?


If my company causes billions in damages and endangers human lives, I can't imagine why my company isn't bankrupted and dissolved.


Ok, but the market makes that decision, not the company. Crowdstrike has no choice but to accept the sentence the market hands it. It’s just that the market appears to have sentenced it to…barely anything. Blame those still using CrowdStrike after this incident.


> Blame those still using CrowdStrike after this incident.

I think you'd have to ask "why are they still required to use CrowdStrike or any AV provider?" I think once you find the answers to these questions you realize this is not a properly functioning product market.

How you can then build a publicly traded company on the back of a complete and total lie is another subject, but it's certainly also implicated in the above questions.


If your company causes damage at society-scale (hell, even if it does major damage to one person's life), the state should be ready to intervene and make the company pay the tab for the damage they caused? Like, that doesn't sound very controversial.


Yea. Their contracts likely have clauses for all of that. I say likely, but we already know this is true because it's come out.

The thing is, crowdstrike isn't the only incompetent party here. Many major companies (looking at Delta) probably made it worse for themselves with a very poor response after.

So should crowdstrike pay beyond a reasonable measure because of Delta's poor response?


No contract clause can protect you from a gross negligence tort.

(Or equivalent in one's respective civil law system.)

This might be the easiest gross negligence tort case to show and litigate -- still hard, but if everyone starts the lawsuits they cannot pull out the contract to protect themselves. They will try, of course, and they will fail in all but the obvious cases.

What you cannot sue them for is unforeseeable damages -- e.g. I lost my dream job because the computer died during the interview. But a company ceasing operations is generally fair game. And plaintiffs can argue that no reasonable person could foresee and mitigate against this disaster, so the failure is not due to the plaintiff's own negligence.


Reckless typically requires conscious disregard of risk. Arguably, that would require Crowdstrike emails from programmers saying “this is risky, we need to test it” and management responding “F it! We’ll do it live!”

If nobody in CS realized how dangerous their process was, it’s not reckless.


That's interesting, but my sniff test isn't passing. "Reckless driving" doesn't require me to know it's a bad idea to do 100 miles per hour in a 25; it's reckless whether I realize it or not, right? IANAL, but the only offense I can think of that requires knowledge to be at fault is slander, at least in the USA.


Actually generally the legal system would decide that, not “the market”.

I.e. investors have assigned roughly zero probability to CrowdStrike bearing the full cost of this incident, and set the market price accordingly.


The market as a tool for punishing bad players is far from perfect. It's why we still have monopolies and see consumer antitrust and similar suits in court. Advocating for shifting blame to customers still using Crowdstrike is ignorant of the problem and further signals a dishonest approach to the issue at hand.


Equifax, for example. They probably caused way more damage leaking all our info


The company should be fined so heavily that they become a government asset.

With current shareholders wiped out entirely, it should then be re-listed.


IMO the company that this should have happened to is PG&E. I think California could have forced them into liquidation and bought their assets. No bailout required, complete loss for shareholders, and CA could potentially have fixed much of its disastrous utility situation at a reasonable price.

A company-ending fine or judgment against Crowdstrike wouldn’t come with any great reason for a public takeover — Crowdstrike could cease to exist and the overall ecosystem would be fine.


Outside of Alaska and Wyoming, Silicon Valley has the worst power and internet of anywhere I've ever lived (worse than AR, MN, and ND by a long shot), measured either in incremental cost or uptime/availability. The fact that PG&E keeps requesting additional rate increases "for fire safety" and immediately kicks those back to shareholders isn't a great look either.


Hide themselves from the face of the world for at least 5 years until people somewhat forget about them.


Because people on the receiving end are the same - they accepted and rolled out the update without so much as "canarying" it. SolarWinds was the same - the customers weren't bothered even by mismatched integrity hashes. It is a tacit pact in our industry - we all screw up and cut each other slack. Who will cast the first stone?


> they accepted and rolled out the update without even as much as “canarying” it.

Well, no; AIUI part of the problem was precisely that this update was pushed in such a way that it skipped any canary system in place. There might be a separate conversation to question what percentage of their users were taking advantage of its staged rollout features, but it's rather immaterial when the incident in question bypassed them even if users had configured it sensibly.
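
For concreteness, here's a minimal sketch of what a canary/staged-rollout gate means in practice. Everything in it is illustrative -- the ring names, percentages, and crash-rate threshold are made up, and this is not CrowdStrike's (or any vendor's) actual mechanism -- but it shows why a force-pushed, fleet-wide content update defeats the whole idea:

    // Illustrative only -- not CrowdStrike's rollout logic. A minimal canary-ring gate:
    // each ring receives the update only if the previous, smaller ring stayed healthy.
    struct Ring {
        name: &'static str,
        percent: u32, // hypothetical share of the fleet in this ring
    }

    // Hypothetical health gate: halt if more than 0.1% of hosts in a ring crash.
    fn healthy(crash_rate: f64) -> bool {
        crash_rate < 0.001
    }

    fn rollout(rings: &[Ring], observe_crash_rate: impl Fn(&Ring) -> f64) {
        for ring in rings {
            println!("pushing update to '{}' ({}% of fleet)", ring.name, ring.percent);
            let crash_rate = observe_crash_rate(ring);
            if !healthy(crash_rate) {
                println!("halting rollout: {:.1}% crash rate in '{}'", crash_rate * 100.0, ring.name);
                return; // damage stays confined to the smallest ring
            }
        }
        println!("rollout complete");
    }

    fn main() {
        let rings = [
            Ring { name: "canary", percent: 1 },
            Ring { name: "early", percent: 10 },
            Ring { name: "general", percent: 100 },
        ];
        // Simulated telemetry: the canary ring reports mass crashes, so the push stops there.
        rollout(&rings, |ring| if ring.name == "canary" { 0.95 } else { 0.0 });
    }

When content is force-pushed to every host at once, a gate like this never gets a chance to trip, no matter how sensibly a customer configured their rings.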


But the customers installed CS software that could do this, so they are partly to blame. I do not think you will find that Tesla would allow a third-party update to its cars, or that an oil rig would allow third-party updates to critical parts of its systems. So it's about understanding the context. I think in a lot of places this is a risk that is OK, but maybe not everywhere. And I hope some companies with critical systems will learn from this.


> But the customers installed CS software that could do this, so they are partly to blame.

It depends on if/how it was communicated. If there's a big red box in the user manual that says, "this software might take updates that completely bypass any phased rollout you configure", then yes it was probably irresponsible to use it. If, however, the software lets you configure phased rollouts and fails to mention that they might just get ignored, then I don't see how the customer can be blamed at all. (And in both cases, if CS shipped such an update with exactly zero testing whatsoever, which strains credulity but is what I've read, then they still get most of the blame.)


Crowdstrike can force-push an update at any time of their choosing that the connected device will grab and load, is my understanding.


Don’t you see that you’re only enforcing my point?


No, because "canary" in the context that you used it, has a specific meaning. If you believe they should have tested CrowdStrike more or been more skeptical of their claims before licensing, that's independent of the user/administrators doing canary-style testing.


What’s the default? And what did their technical account manager recommend? My guess, no canary ring.


Crowdstrike should be held accountable, but so should any reasonably sized enterprise that allows code to be pushed fleet-wide without testing. All the large enterprises I've worked for required Windows patches to be tested before being pushed to production, so why are Crowdstrike updates treated differently?


It clearly states in the post mortem that rapid response updates were pushed globally without any say from the customer: https://www.crowdstrike.com/wp-content/uploads/2024/08/Chann...


So in theory, if a company like CrowdStrike were compromised by criminals, a disgruntled employee, or the NSA, the impact could not be avoided?


Given the cybersecurity landscape it is not unusual for security software updates to be pushed globally without the option to test or even stagger them. When a potent new vulnerability becomes public knowledge (or semi-insider knowledge) and especially if there's already a PoC available, organizations only have minutes to a few hours before threat actors begin utilizing it.

APTs and organized crime groups have 24/7 staff to weaponize and integrate new vulnerabilities into their workflow as rapidly as possible, or have contracted other groups to provide this service.


So, you would prefer that they not accept this "award", and thereby avoid admitting that they messed up?

And honestly, crowdstrike is more likely to go under than a company that failed to reinforce a bridge. Their mistake caused measurable harm to many well-funded companies that have the resources to sue crowdstrike in court.

If crowdstrike survives, it will be because there isn't a lot of competition in their market, not because they can excuse themselves of liability.


I would have liked to see the face of CrowdStrike's legal counsel as they accepted this award. There is no way they ran it by them.


Statistically speaking, it seems likely people really did die from this mistake, if only indirectly, for example due to delays in medical care caused by the outage.



I don't think this is all that accurate. In the engineering space, Boeing has so far accepted responsibility for two fatal crashes and the fucking door falling out of an airplane and is still in business.


We've professionalized industries like engineering and medicine because incompetent practitioners are a threat to public health and safety. Software is now in everything and incompetent practitioners have been a threat to public health and safety for a long time now yet we do nothing about it.


Blaming individuals when systems fail: does it work reliably?

And how can certification work across borders between jurisdictions?

I've seen a fair share of engineering disasters in my own developed country where a few signatures by engineers didn't prevent the causes.

Regulations and jail don't seem to be enough of a disincentive? How do you force someone to do a good job?

Open source is not very compatible with certification.

Certification either (1) makes all software proprietary, or (2) requires people to sign off that a particular piece of open source software is safe, or (3)... maybe we should disallow the liability-disclaimer clauses of open source software licenses?


In construction, Grenfell happened and witnesses demanded immunity from prosecution to testify, because they knew they’d broken numerous laws in its construction and certification. Residents at similar buildings are the ones paying to make them safe for habitation, not the crooks that built them. Professionalisation is not a magic bullet.


As if "professionalized industries" like engineering and medicine don't ever commit mistakes.


Found the gatekeeper


Crowdstrike and the countless other software failures before it are literally the proof that the gates need to be kept. The only question is whether we start doing our own gatekeeping or it eventually gets forced on us by heavy-handed legislation, like it was for doctors and civil engineers.


I'm sorry, when did the switch flip in this industry where we decided we didn't want to hire people with expertise and experience?


Make sure to pick a surgeon off the street next time you need an operation. Don't want to gatekeep the profession, after all.


Not necessarily from the street, but I do expect bio graduates to be trained in health care at scale, without being limited by residency programs. There's no point training 10 biology graduates only for 9 of them to work as waiters or on OnlyFans.


The time to take yourself seriously is before the stuff happens. Being stuffy now doesn't add anything. Accepting the award means they have something to show every single new hire and everyone in that office will have a physical reminder to do better in the future.

I get the other points about consequences, but I don't think accepting this award is in anyway problematic. It's one of those things that I expect only people that care about "appearances" would complain about.


A less charitable way to look at it is that they weren't taking things seriously before the incident, and they're still not taking things seriously now.

What the most appropriate way to view it is, I don't know. I think I'd need to know way more than I do about Crowdstrike leadership.


This wasn't even their first serious blunder this year, just the most damaging and visible. The nature of their mistakes seems exceedingly preventable too, with them failing at textbook SRE practices. Their CEO has now been at the helm of two different companies that have had similar problems under his leadership. The evidence keeps piling up, and people want to keep making excuses for negligent behavior. Why should we excuse facts for hypotheticals?


I agree. They show up, cop to it, and collect a memento mori that will hopefully help motivate improvement in the future. They have a lot of work to do to repair their reputation, and I don’t think they’re foolish enough to think that this is anything more than a small step on a long path.


Ever had someone make a mistake that cost you time or money and then tried to laugh it off as no big deal? That's what this feels like.

The security industry needs to grow up.

The best thing they could have done is fire the CEO, apologize profusely, and then use their army of salespeople to help make things right on a one-on-one basis with their customers.


I'm sorry, how has CrowdStrike at all demonstrated that they're going to do better?


Wait till you learn about finance, or oil and gas, or mining, or the countless other legacy industries that have quietly run America since the Rockefeller days. Many parts of Texas lost power for weeks due to Beryl, and all the power companies got was a slap on the wrist (despite being explicitly warned about this scenario many times).


Hell look at Grenfell in London! 72 people died and the people that designed it wanted immunity from prosecution to testify at the inquiry! Similar buildings are charging the cost of upgrading the cladding to residents instead of doing the right thing and eating the cost themselves.


You would not want for a sympathetic ear if you also criticized these companies. The point is not that CrowdStrike is uniquely incompetent, not at all. Every critical organization needs to be held to a higher standard, not just incompetent security firms.


Companies are too easy a target...

What were the regulatory incentives? Was the electricity market designed to encourage resiliency?

There was a systemic fault - it needed a collective solution, not one that relies on individual companies doing the right thing?


I can think of at least one industry where the price of failure is almost always borne by the users and not the companies. Very closely integrated with tech as well.


100% agree. As mentioned in another post, the acceptance of this joke award is completely tone deaf.

Hospitals, banks, airlines, governments, and of course various IT operations at companies that are forced to use this endpoint security crap and Windows were impacted. Many people suffered degraded quality of care at medical facilities. Surgeons losing access to critical imaging/labs during surgery. Probably many canceled and rescheduled surgeries as well.

People’s flights were delayed or canceled. Imagine having to take a last-minute flight to visit a loved one on their deathbed, only to have it canceled because ClownStrike's shit and incompetent IT departments/CTOs fucked them over.

Many people/businesses unable to access critical banking services.

Then the amount of lost productivity for office workers. Many hours lost for IT folks, often even working into the weekends. Time lost to dealing with ClownStrike bullshit when that time could have been spent with their families and friends.

Fuck ClownStrike, George Kurtz, and this latest clown, Michael Sentonas


It just illustrates how low the stakes are in tech really.


This attitude is why we have a culture of fear and do-nothings.


To add insult to the injury, these people call themselves "software engineers".


There is precious little real “engineering” that goes on in the field of “software engineering”, industry wide, in terms of ensuring that our software is reliable and secure. Performance and development velocity always seem to take the driver's seat in a whole lot of software.

See also this talk by Bryan Cantrill, “Scale by the Bay 2018: Rust and Other Interesting Things” https://youtu.be/2wZ1pCpJUIM where he talks about software platform values, in the sense of what different programming languages and other things focus on. He also touches on the fact that while higher level layers in the stack might value security it kind of falls apart when the very microcode in our CPUs sacrifices security for performance.

This was in a period of time where Spectre https://en.m.wikipedia.org/wiki/Spectre_(security_vulnerabil... and Meltdown https://en.m.wikipedia.org/wiki/Meltdown_(security_vulnerabi... had reared their ugly heads.


> There is sparsely little real “engineering” that goes on in the field of “software engineering”, industry wide

There surely is actual engineering, but it's scattered unevenly across companies. It's funny that Crowdstrike did fuzz their code but didn't even check for correct arity. I think the cybersecurity industry isn't as strong an adopter of sophisticated engineering techniques as, for instance, web development, where new testing techniques evolve every few years.


I really don't think that's true. All software is undertested, and it's likely that there isn't a significant difference between web apps and security apps.

Having said that, writing ring 0 drivers in an unsafe language sounds like an invitation to disaster. That's what went wrong with CrowdStrike. You don't need any testing to avoid crashing the OS when given a bad virus definition file. (Making the virus definition file do something useful... sure, you're gonna need tests for that.)
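
To make the "bad definition file" point concrete, here's a hedged sketch of the kind of up-front validation the "correct arity" comment upthread is getting at. The record format below is entirely made up (it is not CrowdStrike's actual channel-file layout); the point is simply that a content update is untrusted input, so you check the declared field count against what the consumer expects and fail closed, instead of indexing past the end of a table in kernel mode:

    // Purely illustrative: a made-up "definition file" record, not CrowdStrike's format.
    // The idea: validate counts and bounds up front and return an error instead of faulting.
    #[derive(Debug)]
    enum ParseError {
        TooShort,
        WrongFieldCount { expected: usize, got: usize },
    }

    const EXPECTED_FIELDS: usize = 20; // hypothetical arity the consumer was built against

    fn parse_entry(record: &[u8]) -> Result<Vec<u32>, ParseError> {
        // First byte (hypothetically) declares how many 4-byte fields follow.
        let Some((&count, rest)) = record.split_first() else {
            return Err(ParseError::TooShort);
        };
        let count = count as usize;
        if count != EXPECTED_FIELDS {
            // An unchecked consumer would blindly read field 21 of a 20-field table here.
            return Err(ParseError::WrongFieldCount { expected: EXPECTED_FIELDS, got: count });
        }
        if rest.len() < count * 4 {
            return Err(ParseError::TooShort);
        }
        Ok(rest
            .chunks_exact(4)
            .take(count)
            .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect())
    }

    fn main() {
        // A malformed record claiming 21 fields: rejected instead of read out of bounds.
        let mut bad = vec![21u8];
        bad.extend(std::iter::repeat(0u8).take(21 * 4));
        println!("{:?}", parse_entry(&bad));
    }

Whether in Rust or C, the design choice is the same: parse and validate at the trust boundary, and treat vendor-pushed content the same way you'd treat attacker-controlled input.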


Performance is dead in the trunk next to a shovel and some quicklime. I know people say they take performance seriously, but as someone who is also a user of commercial software, I am reminded of James Baldwin: "I don't believe what you say, because I see what you do".


On the other hand (maybe I'm just playing devil's advocate here): nobody died (I hope, at least! It's possible, if some hospital equipment, 911 calls, ... failed...) from this incident, despite it being at such a huge scale that almost everyone knows about it. It's almost as if humanity can be... fine... when all this computing equipment fails.


Many hospitals, including emergency rooms, were shut down. Maybe no single death can be directly tied to the event, but I'm pretty sure this effectively resulted in excess deaths.

Maybe some of those inconvenienced had time to stop and contemplate the world, but there are parts of the world where computers stopping doesn't leave people just fine.


Agreed, and I'm sure there are tons of anecdotes just in this community. My daughter, for example, is a night-shift labor and delivery nurse at a level three metro hospital. They were heavily impacted. All of the phones were down, all of their internal messaging was down, translation services were down; it was a rough couple of nights.

I don’t have any direct experience with crowdstrike, but in my experience with security vendors in particular, they make it very difficult for customers to inject useful change management into the process. I’ve been in infosec for nearly thirty years and “my people” also need to shoulder some of the blame for catastrophizing delays in delivery of updates to preventative and detective controls. I’ve always known this but spending the last year ‘outside’ in central platform delivery and operations for a large financial has really brought that to light. Fortunately I know how to speak security so it helps us navigate but many aren’t so fortunate.


Stopping air travel has probably canceled a lot of unnecessary meetings and slowed down global warming.

Maybe events like this provide value in that they identify which systems are actually mission critical.


It also probably caused a lot of people to not be able to say goodbye to their loved ones, or miss their holiday, or whatnot. People don't travel exclusively for frivolous reasons.


I am not suggesting that airlines should see their operations as non-critical (and let a third party make arbitrary untested updates to their critical systems).

But maybe some organizations using air travel can learn from this. I am still hoping that the pandemic will have had some lasting positive effects (on top of all the pain it has caused).


Then what value do they provide?


Glad you find it funny, chaps. Forwarding this to my legal dept.


This shows great grit and ownership. Props


That seems unwise and tone deaf - to laugh it up on stage after causing easily tens of billions of dollars in losses worldwide.



