A400M Airbus Flier crashed because of software issues

alandarev · on May 19, 2015

Software contractor for Airbus and Rolls-royce here.

All safety critical software (every piece of code ran on-board is safety critical the least) in aerospace needs to pass the DO-178 standard [1].

That is far more serious than standard unit tests you are used to in node.js applications. Generally speaking, to develop a piece of code under that standard it takes 20% of time to write the code, and 80% to testing, and enormous amount of documentation (that is optimistic estimation, usually worse).

Quoting speaker from DO-178 training course I attended:

   People often ask us. "How do we know the standard works?"
   We give this answer: "We do not know. But there have been zero crashes due to software issues since introduction"

If this crash confirms the cause to be a software bug, that is something much bigger than an airplane crash - a huge punch to the whole federal aviation administration.

[1] - http://en.wikipedia.org/wiki/DO-178B

on May 19, 2015

[deleted]

Jayschwa · on May 19, 2015

I worked at an avionics contractor for a number of years and that mirrors my experience as well. On Level A projects, the process was at least followed, but it was often under tight stressful deadlines. Testing was frequently off-shored to save money, but often resulted in low quality tests that had to be reworked at the last minute by in-house engineers.

The state on a later Level C & D project was in worse shape: bloated architecture, lots of requirements churn, little to no peer review, one unrealistic deadline after another, and mandatory overtime for large periods. It was not an environment conducive to thoughtfulness or quality. I finally got fed up and quit.

snom320 · on May 19, 2015

Did you work specifically on the engine controllers? Have you considered speaking to the accident board about this?

jonnycowboy · on May 19, 2015

I also work in aerospace software. Following DO178 certainly does not guarantee that there will be no software bugs. The point of DO178 is to follow a process that will _minimize_ the number of bugs by having adequate peer review processes throughout the requirement definition, coding, integration phases, in addition to the testing you mention above. Testing DO178 only tests that the code follows the requirements. If your requirements are fucked, so are you.

gvr · on May 19, 2015

That is certainly one source for error. There are many. Another is that testing does not give you exhaustive coverage of the state space just because each branch of the code is visited.

The standard does mention "formal verification/methods/proof" but to my knowledge it's rarely been used extensively.

wyldfire · on May 21, 2015

Mathworks makes a sat solver kinda product that does those kinds of proofs. Pretty interesting concept, IMO.

userbinator · on May 19, 2015

The target catastrophic failure rate there is 1 in a billion hours, which doesn't seem all that high to me... there are over 100000 flights a day[1], and the average length of one is well over an hour, so in one day all the aircraft in the world have accumulated a total of over 2.4 million hours in operation. If each flight was only an hour long, that's 417 days to 1 billion total hours, and if the failure rate really was 1 in a billion we'd expect to see one of these happen at almost yearly rates.

The actual failure rate of software to this standard seems to be at least two orders of magnitude higher.

[1] http://www.garfors.com/2014/06/100000-flights-day.html

peeters · on May 19, 2015

Where is the "actual failure rate" number coming from?

branchan · on May 19, 2015

How does 100k flights a day equal 2.4 millions hours of operation if each flight was on average 1 hour?

grkvlt · on May 21, 2015

Yeah, confused. If we assume a two hour average flight, we get 365 * 2 * 100K = 73M hours/year so one failure per 1G hours is one failure in 1000 / 73 = ~14 years.

calebm · on May 19, 2015

I wrote (non-critical) software for the A400M. I don't think that standards like DO-178B necessarily lead to higher quality code; my experience was that 80-90% of time was spent doing documentation, testing, and in general, trying to prove that the software was going to work right, leaving the engineers with very little time to write the actual software...

Jtsummers · on May 19, 2015

What processes did you and your coworkers use for writing up your requirements, design, test procedures, etc. (Basically, the DO-178B/C artifacts.) IME, too many times groups use Word and Excel, and collaborate via email. Version control is, "I think CM has received my latest revision". Using better tools (requirements management software) would eliminate a lot of this overhead. At a former employer we were moving to DOORS on all new projects. They were able to spend much more time on development and analysis and less time on documenting and checking documentation because the process was smoother. They weren't editing 1k page documents, they were editing a collaborative document backed by a database, not unlike a wiki. I could edit one section, and you could edit another at the same time. No risk of merge conflicts, no risk of your update replacing mine.

crististm · on May 19, 2015

DOORS is just a glorified database with a text editor and schema for tracking change requirements. I found it lacking even at this.

Jtsummers · on May 19, 2015

Not saying it was great, but it was a hell of a lot better than Word + Excel. Probably a lot of better tools out there, but it's been a while since I've needed to use them so I'm no longer familiar with what's available.

What made it better:

Concurrent editing. Only one person at a time was modifying any section, but two or more sections could be edited by a group of people.

Version control baked in. VC is awesome, reducing friction means that it actually gets used. Checking in/out sections was just part of the process.

Automatic traceability matrices. These are not easy to make by hand. That way is error prone and tedious. It's also hard to verify. With DOORS we were able to generate these automatically. Verifying them was relatively painless because we weren't flipping through several 1k page documents to make sure things actually matched, we could easily skip to only the applicable parts of any document. The main error that it didn't reduce was when a requirement didn't get linked to every test case or design feature that it should've been linked to.

Again, probably better tools out there than DOORS (at the time, or hopefully by now). But it's a lot better than what most offices do with either post hoc document generation or ad hoc generation with Word + Excel.

sliverstorm · on May 19, 2015

in general, trying to prove that the software was going to work right, leaving the engineers with very little time to write the actual software...

Sure sounds like the way it should be to me. Maybe it needs to be made easier to prove your software is correct, but to me it seems like for systems like airplanes, code that cannot be proven correct is worthless.

Considering most programmers are said to produce 6 lines of good code a day, maybe it's not even actually slower in the end if the formal verification process filters out every other line you would have written that day.

calebm · on May 20, 2015

It's not a bad thing to put in quality assurance measures in place. It's a bad thing, though, if so much time is spent on QA, that there you are rushing to write the actual code. Rushed coding does not produce quality.

emp_zealoth · on May 19, 2015

Every time i read about stuff like that i get mad. Attitude regarding software is just criminal. You wouldn't let internt/inexperienced people design mechanical parts of the plane. Engineers are bound by standards regarding everything. Materials, mechanical couplings, documentation. when i design a machine, i can't scribble it on a napkin and ship it. Why is it acceptable for idiots to write messy, unintelligible code? When i create a mechanical drawing, i am bound by rules how to draw it. With coding everyone has their own ideas. When i design a machine i have to make sure it will work, period. Calculations to be done no matter how i don't have to spend thousand of man-hours fiddling with it to see if it works as intended

the apologetic attitude needs to stop. programs are just complicated machines and should be treated as such not some mystical voodo that will be broken no matter what

tormeh · on May 19, 2015

I think you're glorifying engineers here. They're a lot more careful, but they don't have some fairy pixie dust of rigor that makes everything work perfectly.

belorn · on May 19, 2015

How does liability work, and do Airbus have access to the source code? I assume that if a "bug" was found in the architectural blueprint for a plane it would be the manufacturer who in the end is responsible for not finding it, but they would naturally have full access to the blueprint when making the plane. Is the same true when developing the custom software for the plane?

luch · on May 19, 2015

Well it's military airplane so it's a bit special here, usually it's the costumer (i.e. the army) which is liable if the plane has passed the final tests.

However it's a complete different matter for civil airplanes : it's the lead dev/project manager which is liable for life for what he has shipped. For example, a retired engineer from Airbus was heard in trial for the Concorde accident in France in 2000.

TeeWEE · on May 19, 2015

I'm quite baffled at the fact that this standard does not include formal verification of the software models used.

Formal verification and state space analysis can prove that the software "model" will not fail. State space exploration of the actual implementation is actually often not feasible due to the enourmous amount of states.

So my question: Are you doing formal analysis of the software models/designs? I know they are using it in software used in space crafts / nuclear power plants.

Reference: - http://ti.arc.nasa.gov/m/profile/dimitra/euromicro-share.pdf - http://javapathfinder.sourceforge.net/

sfrank · on May 19, 2015

The DO-178B/C explicitly does not specify how verification is done but describes the properties that are expected of the verification evidences that you submit to the certification authorities; so formal verification is perfectly fine as one item on your verification check list. In particular DO-333 amends DO-178C with specific topics concerning formal methods.

For example Astree [1] has been developed for decades now, with Airbus as one of the major sponsors.

[1] http://www.astree.ens.fr/

tedunangst · on May 19, 2015

What is the largest piece of code you have formally verified?

bglazer · on May 19, 2015

The seL4 mircrokernel is comprehensively formally verified and is marketed towards aerospace applications.

https://sel4.systems/

castell · on May 19, 2015

Which programming language is commonly used there? Ada, C, C++, JOVIAL, Asm?

shepardrtc · on May 19, 2015

Nowadays, its C/C++. In the past it was Ada (at least in the US). Ada was created specifically for this sort of thing (and was actually a requirement for certain projects), but it never gained popularity so companies would just petition the government to use C/C++ instead. Its a shame, too. Ada is fantastic if you want to avoid bugs. I wish more people would give it a chance.

On the other hand, I also wonder why people don't think of using Erlang for something like this. The VM is designed for a ridiculous amount of uptime, it has a supervisor tree that can monitor and restart failed processes, and it can interface with C/C++. A rock-solid VM should be running and monitoring life-safety systems.

Jtsummers · on May 19, 2015

Erlang's model is entirely appropriate, but the language and VM aren't. This code is often running on small embedded chips (so you'd need to port the VM) and the software has hard real-time requirements, which the Erlang VM is not (currently) set up to handle, nor would it necessarily be able to achieve on the commonly used processors. Another strike against the language (as much as I love it) is that it's dynamically typed. That's less appropriate for this sort of software. There are static analysis tools for Erlang that mitigate this, but it's still an issue. Large classes of bugs and errors can be eliminated or minimized with statically typed languages or with static analysis tools if they're well integrated into the build process. A real-time, statically typed language with Erlang's semantics and compiled to native binaries would be a boon, however.

I'll speak to the 787 avionics system. It used a system of channels/buffers and processes very much like what Erlang and Go use for interprocess communication (I'm trying to remember now if channels could be received in multiple processes like Go or if only a single process could receive like in Erlang). This was an excellent model for what we were doing, and really for a lot of systems this sort of CSP and actor style model maps well.

tormeh · on May 19, 2015

Using C/C++ on these projects is idiotic. If Airbus used C/C++ then they deserve all the financial loss in the world. They have blood on their hands.

lambdaelite · on May 19, 2015

Keep in mind they don't use C/C++. They use C/C++ with a coding standard (like MISRA), static analysis tools, validated compilers, development processes incorporating change control, documentation, verification and validation, etc.

What alternative are you suggesting?

progman · on May 19, 2015

I know that Ada compilers are 100% verified correct but is there really any validated C++ compiler? Which one?

AFAIK (partial) assurance in C/C++ can only be handled by additional testing tools, Frama-C for instance.

I agree that C/C++ should not be used for security applications. Ada is a much better choice because it was designed for security.

lambdaelite · on May 19, 2015

Not sure if I'm understanding your question correctly, but Wind River claims their Diab compiler is validated by TÜV NORD and is has been used for stuff up to SIL4.

In fact, they (http://windriver.com/products/product-overviews/PO_Diab_Comp...) say:

  Diab Compiler has been a reliable code generation tool for
  avionics products certified for DO-178B, products for the
  nuclear market certified to IEC 60880, railway applications
  certified to EN 50128, and industrial products certified
  to IEC 61508, and is now qualified for use in automotive
  applications certified to ISO 26262.

Ada does have some built-in advantages, but I think my point still stands: the language is a small part of the entire SDLC, and I don't think it's the most important part.

castell · on May 20, 2015

Is TÜV Nord/Sued known for extensive software checking?

CHY872 · on May 19, 2015

Are you sure that Ada compilers are verified correct?

I'm pretty sure that the only industrial formally verified compiler is CompCert (for C), though I could be wrong. The motivation for CompCert was certainly that Airbus wanted such a compiler.

Ada wouldn't be a better choice simply because it's designed for security. It'd be a better choice if it turned out better in practice. I've read some of the studies that have been done, and I haven't found them convincing.

Requiring additional tools just isn't a problem, if it works well. Don't criticise the process, criticise the result.

tormeh · on May 19, 2015

Ada, of course. Are you saying it wasn't at the forefront of your mind? If I sound patronizing it's because I am.

lambdaelite · on May 19, 2015

As I said elsewhere, Ada does have some very nice built-in advantages, but I think my point stands: the language is a small part of the entire SDLC, and I don't think it's the most important part.

The focus on languages instead of the SDLC is telling, I think.

jcadam · on May 19, 2015

Really, you'd want to use Ada for something like this (The language has survived specifically because it's managed to carve out a niche for itself in aerospace).

I once worked on satellite systems using Ada95 and Ada2005 (Ada is definitely not dead). The language is a pain to use but is impressive in that it catches more crap at compile-time than anything else I've seen.

progman · on May 19, 2015

> Ada is definitely not dead

True. More information: http://www.ada2012.org

lamby · on May 19, 2015

With that kind of coding-to-testing-and-documentation ratio, does it even matter?

TheLoneWolfling · on May 19, 2015

Yes, definitely.

For a quick example, there are many languages where you cannot accidentally run off the end/start of an array, barring a compiler error.

With a language like C / C++, it's possible. Not probable, given that sort of testing. But possible nonetheless.

Some languages are also easier to test than others, partially because of this, partially because of other issues. For instance, in some languages you can guarantee at the language level (again, barring compiler errors) that something won't be modified. (Like const, but actually working.)

gsnedders · on May 19, 2015

Yes. The more the type-system encodes the more the type-checker (in the compiler) proves for you. No matter how much testing you have, nothing will ever come close to proving properties.

UnoriginalGuy · on May 19, 2015

Yes. Some languages are better at scoping than others. Functional languages are easier to test because they have more confined scope.

tjbiddle · on May 19, 2015

Then to add to it, what are tests written in?

agumonkey · on May 19, 2015

I've seen this first hand, unfortunately this doesn't escape the hiring market reality. Many documents and source files are littered with (very) bad code from an 3 months employee that couldn't do better before leaving considering the extreme size of the project.

I wonder if someone can pull an #ElonMusk on the DO-178 to slim things down in order to have better control.

ps: planes fly with bugs, see the DreamLiner, Airbus ones aren't free from them either, employees know this. (I guess they travel by train)

ExpiredLink · on May 19, 2015

> I guess they travel by train

The train brake control systems are written in ... C.

agumonkey · on May 19, 2015

Yeah but usually when two trains collide, you don't have the time to see it coming.

smcleod · on May 19, 2015

Is the code is available for review outside the operating constraints of the system that designed it - I.e. Is it open source?

mhandley · on May 19, 2015

Here's Airbus's statement on the Alert they sent to operators:

Airbus Defence and Space has today (Tuesday 19 May) sent an Alert Operator Transmission (AOT) to all operators of the A400M informing them about specific checks to be performed on the fleet.

To avoid potential risks in any future flights, Airbus Defence and Space has informed the operators about necessary actions to take. In addition, these results have immediately been shared with the official investigation team.

The AOT requires Operators to perform one-time specific checks of the Electronic Control Units (ECU) on each of the aircraft's engines before next flight and introduces additional detailed checks to be carried out in the event of any subsequent engine or ECU replacement.

This AOT results from Airbus Defence and Space's internal analysis and is issued as part of the Continued Airworthiness activities, independently from the on-going Official investigation.

They're asking for a one-time check to be performed on the ECU. If it's just a software bug, normally a one-time check wouldn't reveal whether or not that bug could trigger. So, obviously they've found something they're concerned about, but it seems to me to be a bit early to say as Spiegel Online do that software caused the crash.

In any event, one of the flight recorders was only just sent off to the manufacturer: http://economictimes.indiatimes.com/news/international/world...

Nitramp · on May 19, 2015

> So, obviously they've found something they're concerned about, but it seems to me to be a bit early to say as Spiegel Online do that software caused the crash.

I would assume the Spiegel report is based on more sources than just Airbus' public statement. At least most of the time, Der Spiegel is a real newspaper, not just some blog spouting stuff based on conjecture from a single source. They mention internal sources in multiple places in the article.

JshWright · on May 19, 2015

Or it seems like the problem may have been a known (or suspected) issue with an old firmware, and the check is to make sure the firmware is above a certain version (which would also explain why the check would be necessary on any replacement ECUs).

mhandley · on May 19, 2015

The A400M that crashed was on its first test flight, so unless they've done something very odd with versioning, it's unlikely that all its ECUs had older firmware than planes that already shipped. Besides, with aircraft, all changes are logged with a ton of paperwork, so they shouldn't need to check the aircraft to know what firmware they're running.

DougWebb · on May 19, 2015

It doesn't seem so unlikely to me. Supply chains are long and parts are bought in bulk. I think it's likely that the parts for the ECUs, including boards with chips containing the older firmware, are warehoused where the ECUs are assembled. The ECUs are probably then bought in bulk and warehoused where the planes are assembled. The paperwork for the plane probably includes the ECU serial numbers, but it probably doesn't include serial numbers for all of the components installed on all of the boards inside the ECU, especially if they didn't have the foresight to think that those numbers would matter. Afterall, it seems there's a way to get the firmware version by querying the ECU.

JoeAltmaier · on May 19, 2015

I think you underestimate the tracking done in commercial airplane manufacture. I believe the source of every rivet is well-known. And how many planes do they make? Many parts are likely manufactured as needed, one at a time.

DougWebb · on May 19, 2015

I'm sure the tracking is a lot more detailed than, say, consumer products, but it's probably not as detailed as it is for space hardware. There's a cost to that tracking, and the commercial airplane industry needs to be cost-competitive. So I'm assuming they track well enough to meet their own and government-imposed safety standards, but perhaps not well enough to be able to look up the chip firmware for each circuit board on every in-service airplane in a database.

As far as JIT manufacturing, that's certainly the case for bigger parts and systems, but it sounds like the ECTs are replaceable so they probably have backup replacements stocked near most major airports. (Assuming it's a part that can be replaced during routine pre-flight maintenance.) And the components that go into the ECTs are almost certainly produced in large batches rather than continuously.

rurounijones · on May 19, 2015

Watching too many air-crash investigation episodes had lead me to believe 0% of media reported "facts" surrounding plane crashes.

I will wait for the official accident investigation report.

madez · on May 19, 2015

This is a very sound decision.

The media is infamous for not getting facts straight. They much rather write an opinion based on suspicion.

Well, no, thank you. Give me facts, get them straight, then I will make up my opinion.

minwcnt5 · on May 19, 2015

Related: the Gell-Mann Amnesia effect.

madez · on May 19, 2015

The Gell-Mann Amnesia effect seems to be a specific version of something more general: cognitive dissonance. The fact that the media is not to trust is not forgotten, it is simply not evaluated and acted upon. It sits there in your brain until you sceptically reflect on what you think to know and how you act. Some people do that to some degree, most less so.

We have had a long time to recognize that our brains don't work well. It is time to accept the facts.

rkangel · on May 19, 2015

Then stick with a news agency rather than journalists. The information from somewhere like Reuters is carefully presented as pure content, no opinion, and probably has a greater chance of being true.

randomname2 · on May 19, 2015

This is laughably false. While staying away from outright lying, Reuters very often adds their own spin to their reporting.

_3u10 · on May 19, 2015

Especially when it's their own journalists getting taken out in 'classified' videos.

leejoramo · on May 19, 2015

As a kid in the 1980's with a 747 captain for a father, I followed plane crashes with great interest. The difference between the main stream media and what was in Aviation Week's coverage was shocking. Even with the release of the official report, Main Stream media would continue to mis-report, where as Aviation Week would reprint the majority of the report.

SixSigma · on May 19, 2015

it isn't just plane crashes where that applies, tbh pretty much every subject

efdee · on May 19, 2015

I will wait for the episode to air on NG.

inetsee · on May 19, 2015

I worked on software for the C-130J military cargo plane. It was before my time, but an earlier model aircraft crashed during a test flight. The crash occurred shortly after take off, and the entire crew was lost.

There is a critical time period during a take off when the aircraft is at maximum risk. If an engine fails before rotation (i.e. before the wheels leave the ground) an alert crew can stand on the brakes and use thrust reversers. The aircraft may get dinged up, but there is a reasonable chance the crew (and passengers) will survive.

If there is an engine failure after rotation but before the aircraft has gained sufficient altitude, unless there's a big, flat field next to the runway a crash is almost inevitable.

When an aircraft turns, it will lose altitude unless the crew compensates by adding power. An aircraft without power and sufficient altitude cannot make the turns necessary to go all the way around to land on the runway they just left.

sokoloff · on May 19, 2015

Your post is substantially correct, with a clarification on rotation speed(Vr) vs takeoff decision speed (V1).

There are three relevant speeds for large aircraft. (I'm going to generalize very slightly to keep this short and readable.)

V1, Vr, V2.

V1 is the takeoff decision speed. An engine failure recognized before reaching V1 is handled by aborting the takeoff. An engine failure recognized after reaching V1 is handled by continuing the takeoff. At the V1 callout, the pilot flying removes their hands from the top of the throttles (as a physical reminder that aborting/rejecting the takeoff(RTO) is not happening for a simple engine failure).

Vr is the rotation speed, where the nose wheel is lifted from the ground.

V2 is the speed at which the airplane will climb safely with one engine INOP.

In most cases, V1 is the lowest speed, meaning there are cases (between V1 and Vr) where an engine out with the nosewheel on the ground results in continued acceleration, then rotation, and flight.

It's a checkride bust to RTO above V1 for a simple engine failure.

inetsee · on May 19, 2015

Thank you for clarifying my post. I am not a pilot. I am a software engineer (retired). I worked briefly on the C-130J mission computer operating system, then on the maintenance software the ground crews used to maintain the aircraft.

I did not know that there was a situation when a flight crew would continue a take off after an engine failure but before rotation.

tirant · on May 19, 2015

I am not surprised at all. There are a lot of contractors involved in the development of the software for the A400M, and they are basically competing for price and employing undergraduates making below €18K/year, which they replace every few months due to burnouts and bad working conditions. Projects get continuously delayed, and key people barely stay more than a couple of years.

bottled_poe · on May 19, 2015

Have you been involved with the engineering of aircraft control systems, or are you just speculating?

mdaverveldt · on May 19, 2015

I can confirm similar situations for other EU IT projects. Currently, I am a contractor working on the IT side for the Galileo Satellite System. Although the IT consulting company that I work for pays significantly better then the 18K mentioned (25K-30K range) the quality of the work that is delivered is terrible. The mentioned salary is for graduate engineers.

The attrition rate is terrible and its impossible to keep people who have knowledge of the program on board.

I would guess that 30-50% of our team are inexperienced java software engineers with less then 5 years of working experience. Around 100 people work on our project but there are probably 3-5 people who still have a clear picture of how everything is working / supposed to work.

scrollaway · on May 19, 2015

Uh, there's a massive difference between contracting on satellite systems and contracting on civil aviation systems.

colechristensen · on May 19, 2015

In America such things are more often done by DoD contractors and they are paid well and space systems are more tightly controlled than civil aviation.

tirant · on May 19, 2015

I was not speculating. I've been working in one of the contractors. The same situation applies to Railway Control Systems and other Telecommunication Infrastructures.

agumonkey · on May 19, 2015

It confirms what I've seen https://news.ycombinator.com/item?id=9569636

yitchelle · on May 19, 2015

> ...and they are basically competing for price...

and if the contractors are from India, the competition does not stop after the contract have been awarded. During the execution of the project, the competitive behavior will continue. One contracting firm will compete with an aim to reduced the tasks/responsibility of the other contracting firm, thus (or hoping to) capturing more contracting hours. Very vicious and dangerous..

jclure · on May 20, 2015

This would not be the case if they invested in automated (Unit Test) tools which would boost productivity dramatically and allow automated regression testing throughout the project life cycle.

hydrogen18 · on May 19, 2015

I thought that 'lowest bidder wins' was an American phenomenon? Certainly, Boeing and others have figured out how to game the system here in the states and abroad.

nier · on May 19, 2015

My german architect clients say: “Ze lowest bidder wins to start ze project with ze highest bidder’s plans.” Which is funny when you squint enough to ignore the tragedy.

VMG · on May 19, 2015

> I thought that 'lowest bidder wins' was an American phenomenon?

Why should it be?

happyscrappy · on May 19, 2015

Because Europe cares about people not money, like in Greece.

jmnicolas · on May 20, 2015

I'm European ... and I upvoted you. They sold us the EU as a big brotherhood of the European people, instead we got a giant bureaucracy at the service of big corpo.

madez · on May 19, 2015

Oh, the irony, bitter - just like medicine.

driverdan · on May 19, 2015

Source?

pfortuny · on May 19, 2015

Looks like the source is himself.

lexy0202 · on May 19, 2015

I have done a quick and rough translation of the German article into English. Hopefully this is better than the Google Translate version: https://gist.github.com/alexcoplan/0018e3320f99a612c737

BuildTheRobots · on May 19, 2015

Much more readable than the Glenglish version -thank you.

I still don't understand "The crash is the worst accident since the development of the A400M", though. To me that implies there was a pretty devastating accident during the development.

bhaak · on May 19, 2015

This is not a translation error. This sentence has the same puzzling meaning in the German version.

I think it's sloppy journalist writing for the worst accident that happened with the development of the A400M.

tedunangst · on May 19, 2015

That may be reading too much. Another reading is that Prior to development, the plane didn't exist.

bhaak · on May 19, 2015

There are not many details about why exactly the three engines stopped working and it's not yet officially announced. This article has been written with "information Spiegel Online received".

Two translated quotes: "The investigation yield a clear result: Shortly after the lift-off of the test machine, the computers send conflicting commands to the three engines which then powered off."

"Soon after the crash, experts of the German Air Force suspected a software issue with the fuel supply unit because such a fatal drop of power so soon after the start could hardly be explained differently."

So, not much information why the computers sent conflicting commands and also why the engines power down in such a situation.

lucaspiller · on May 19, 2015

> So, not much information why the computers sent conflicting commands and also why the engines power down in such a situation.

I think shutting down the engines is probably the safest option when this sort of thing happens. You could argue they should stay in the present setting, but what would happen if one engine were at 0% and another 100%?

Most aircraft are pretty good at gliding even without power, and I'd assume a deadstick landing is part of the pilots training. In 2001, TS236 flew unpowered for 19 minutes before making an emergency landing (on a runway) with only minor injuries:

http://en.wikipedia.org/wiki/Air_Transat_Flight_236

fransan · on May 19, 2015

> I think shutting down the engines is probably the safest option when this sort of thing happens. You could argue they should stay in the present setting, but...

Shutting down an engine should always be a decision made by the pilot not a machine IMO. A pilot might prefer to blow out an engine if that gives him enough ( even a couple of seconds matter in this situation ) time to reach a save landing spot or avoid an obstacle.

> what would happen if one engine were at 0% and another 100%?

Pilots are train in this situation all the time and is part of the syllabus for a multi-engine rating. Basically the plane would try to turn to the side that the engine failed. Pilot will use opposite ruder and aileron to compensate while cutting back power in the good engine to just enough you can keep altitude if possible.

> Most aircraft are pretty good at gliding even without power, and I'd assume a deadstick landing is part of the pilots training.

So happen here too. The crew tried tried a dead-stick landing in a field when they realized the could not make the airport. Unfortunately the hit a High Power pole and the plane catch fire.

One of the crew members lost in the accident was a friend of my father. My condolence to the families of those who lost their live that day.

VLM · on May 19, 2015

"but what would happen if one engine were at 0% and another 100%"

Part of earning your multi-engine cert is memorizing all manner of different airspeed limits for that kind of situation. If you want to maintain yaw control with one engine feathered (er, shutdown) and the other at full throttle you must be going faster than X knots or whatever. Below that indicated airspeed you pull back on the throttle or you're going into a turn at best or more likely a spin.

sokoloff · on May 19, 2015

In fact, there is a light blue radial line painted (or displayed) on the airspeed indicator for exactly this situation, so the crew doesn't have to remember the value in a high-stress situation.

tim333 · on May 19, 2015

>what would happen if one engine were at 0% and another 100%

Planes are designed to fly fine in that situation - it's what you get if one engine breaks down. Now landing the thing with one engine stuck on 100% would be interesting. I guess you could kill the engine somehow - turn off the fuel or pull the fuses.

TheLoneWolfling · on May 19, 2015

An engine stuck on has happened before, though I believe this was only flight idle:

https://en.wikipedia.org/wiki/Qantas_Flight_32

Mind you, that was the result of an uncontained engine failure.

jonnycowboy · on May 19, 2015

Planes with four engines are certainly not designed to fly with 3 of 4 engines out...

Piskvorrr · on May 19, 2015

Indeed, planes are not designed to be flown in that configuration normally, but they are certainly capable of single-engine flight in emergency situations (where the alternative would be "flying like a ton of bricks"). Exhibit A: http://en.wikipedia.org/wiki/British_Airways_Flight_9

jonnycowboy · on May 20, 2015

No, actually "capable of flying" means capable of performing all flight maneuvers required for safe flight and landing, including takeoff, go-around and approach/landing. BA Flight 9 failed all four engines, normally not survivable - but managed to restart all four engines by windmilling. So they recovered in time for landing.

fransan · on May 20, 2015

Just curious, where did you find that definition of "capable of flying"? Because under that definition a glider ( unless it is a motorize glider ) is not "capable of flying".

Piskvorrr · on May 20, 2015

Okay, under that definition, you are correct. Note that the flight later lost engine #2 again (incidentally making the record of five engine failures on a single flight), so it landed with 3 engines operational.

Piskvorrr · on May 19, 2015

Pretty good at gliding from a high flight level. If you are hitting power poles, you are way too low for gliding an usually-powered aircraft. TS236 and Gimli Glider were exceptionally lucky, IMNSHO.

traufetterg · on May 19, 2015

Dear Engup, I have read your thread that is unfortunately deleted. I am the author of the Spiegel-Story on A400M http://www.spiegel.de/politik/ausland/airbus-a400m-militaerm... I would like to get into touch with you. gerald_traufetter@spiegel.de Best Gerald

smcleod · on May 19, 2015

I wonder if this is the time to argue that it may be worth open sourcing the controlling software for hackers to start criticising and contributing pull requests to. I'm willing to bet that the competence of the collective community far outweighs that of those specially trained to write the software at present.

What is there to lose by opening up the software to criticism other than better aviation safety? We know that obfuscating / hiding source code does not make applications / platforms safer or less at risk to malicious behaviour so I'd like to challenge the manufacturers to do so.

sleeping_pills · on May 19, 2015

While I agree with your statement that open sourcing code can help with improving it's quality, how exactly do you envision (paraphrasing) "hackers contributing pull requests" to code that controls engines on an airplane? Here you have an extremely specialized codebase which can perhaps be understood by a tiny group of professionals and it can actually be tested by an absolutely vanishingly small group of individuals under special circumstances. I don't think "hackers" could even begin to make useful critique of this kind of software, let alone contribute pull requests to it.

nraynaud · on May 19, 2015

I agree with your conclusion, but not exactly for the same reason. Those projects are huge, there is never really small group of people working on something, it's often spread upon contractors over sub-contractors, with people leaving and coming, over many years, so in the end the information is quite spread. Plus there is a lot of documentation (which might also be a downside, because there is a hunt for relevant information). But what I would fear is that people would feel good about it being open source, and never go to have an actual look at it. Or go for a little bit in the beginning and then never again.

madez · on May 19, 2015

You sound exactly like my current boss, who says that Linux is just a hobby project that you can put no trust in.

I mean, who would fix problems if it's just a hobby? And if it's open, it must be a hobby. Surely, that can't possibly work!

sleeping_pills · on May 19, 2015

Did you actually read my post? If yes, can you seriously not see the difference between writing/testing an OS and writing/testing the software that controls jet engines?

Open sourcing something like Linux works very well _precisely_ because it has a very large audience and is (relatively) approachable by hobbyists too.

On the other hand, aerospace engineering and software is narrowly specialized with a (relatively) small group of experts and code used in commercial/military aircraft is anything but approachable to hobbyists.

Then there is the fact that unit testing this kind of code requires engineering knowledge of the specific hardware involved (e.g. not just any jet engine, but one very specific model). Finally, let us not even mention the huge pink elephant in the room, namely that the absolute and vast majority of "hackers" does not have access to jet engines used in commercial (or military) airplanes and even fewer have the ability to conduct test flights.

DougWebb · on May 19, 2015

What is there to lose? 0-Day attacks. Knowledge about bugs in aviation software is potentially more valuable to people who wish to do harm than to the people who would fix the bugs, so there's a concern that someone who finds a bug will sell that info rather than let the maintainers know about it.

The other problem is that the maintainers have to be set up to handle a potential avalanche of comments, criticisms, questions, and pull requests, mostly from people who don't know anything about software development processes and standards within the aviation community. If they're already too overloaded to find all of the bugs themselves, they certainly won't be able to effectively manage open-sourcing their code.

mason240 · on May 19, 2015

Realistically, how could someone exploit a 0-day in a aviation software?

DougWebb · on May 19, 2015

I don't know. I wouldn't want to find out. But more realistically, a good bug that's worth a lot of money will be subtle and hard to find, which means it may be around long enough to be exploitable.

nradov · on May 19, 2015

This is a military aircraft. By opening up their software, Airbus would lose by enabling their foreign competitors and enemies to copy parts of it thus reducing their R&D costs and gaining a competitive and strategic advantage.

geon · on May 19, 2015

Percieved commercial advantage and pride.

PhantomGremlin · on May 20, 2015

Aircraft manufacturers and operators go through great amounts of effort to avoid single points of failure. E.g. on a twin they overhaul the two engines at different times.

But this is different. I wonder if they need to rethink their approach to software? Four engines, running the same software --> single point of failure.

sean-duffy · on May 19, 2015

Wow, this comes as a shock! Last year I saw the A400M appear at the Royal International Air Tattoo and was very impressed, as were many others. Hopefully they'll find the problem and this won't be too much of a blot on the development of this aircraft.

tim333 · on May 19, 2015

From another report:

"Problems in developing the engines, and particularly in certifying the engine control software, contributed to three years of delays and a new cash injection by governments in 2010."

Seems like they had some issues. Surprising it's that difficult.

http://www.reuters.com/article/2015/05/19/airbus-a400m-idUSL...

userbinator · on May 19, 2015

I wonder if this is what lead to Airbus making a pretty vague statement compared to Boeing about its software on this other recent related news:

http://www.bloomberg.com/news/articles/2015-05-18/hacker-cla...

tntcl · on May 19, 2015

So basically the software bug happened, because there are quality problems in the manufacturing street? wtf!?!?

VLM · on May 19, 2015

Perhaps a mental model abstracted from aerospace is you have a somewhat complicated PID controller and it was programmed to tolerate widgets with size of 0.01 to 0.02 because only those can physically fit on the assembly line, and within those limits it is proven and tested to be unconditionally stable and predictable and correct under all operating conditions.

Unfortunately manufacturing let a batch of 0.0201 size slip thru inspection, they just barely fit on the assembly line despite being out of spec too large, and the PID controller makes the system go into oscillation and explode because its outside its theoretical limits of whats possible.

The most insidious spec violations are "manufactured too well" if for example you rely on frictional damping to eliminate oscillations, then making and shipping better ball bearings than you'd ever shipped before, could ruin an overall system because one component is too good. Possibly the "error" is something is too smooth, too straight, too flat, or too low friction. No one ever expects those to cause a disaster, but it can happen.

That would be an example of a disaster involving software, that can be fixed in software, although it wasn't caused by the software, it was caused by bad control system engineering design work. Also this abstract example probably has nothing to do with the real problem, although the "widget" and "size" could very well be something line fuel line tubing inside diameter.

madez · on May 19, 2015

The original speaks of quality problems in the manufacturing plant. This is very general and could include problems in the design phase up to the flashing of the module after assembly of the plane.

tremon · on May 19, 2015

I think the OP was reacting to the use of the term "quality problem" to describe a catastrophic failure leading to loss of human life. At least, that was my response when reading that quote.

zurn · on May 19, 2015

Anyone know how much effort Google is putting into improving Translate? It feels like it hasn't gotten much better over the years even with these first-tier language pairs.

crististm · on May 19, 2015

Effort has little to do with it. If you don't have an angle of attack, you can't solve the problem. Humans have years of experience to put a text into context and interpret it.

zurn · on May 20, 2015

That would be a bad excuse for not improving even if GT matched the state of the art. But it's doesn't for many/most languages.