One of the SpaceX engines came apart during launch

stcredzero · on Oct 8, 2012

> We know the engine did not explode, because we continued to receive data from it.

Uh, they say the engine did not explode. The failed engine shut down and vented gasses ruptured the engine fairing. How about someone change the inaccurate headline? "That smooth SpaceX launch? Turns out one of the engines exploded"

parfe · on Oct 8, 2012

How about a "spontaneous self-initiated partial disassembly event"?

I watched the video and don't see why a burst of flames and debris would not be described as an explosion, other than marketing.

jlgreco · on Oct 8, 2012

Ah yes, you have got to love rocket scientist terminology. The Wikipedia page for the Vanguard rocket used to have an image of one of them exploding captioned with "Vanguard rocket undergoing rapid unplanned disassembly shortly after launch at Cape Canaveral".

DannoHung · on Oct 8, 2012

Already corrected.

jlgreco · on Oct 8, 2012

Sadly yeah. Somebody changed it a few days ago; it was like that for a long while previously though.

stcredzero · on Oct 8, 2012

Yes, but the point is that the engine didn't explode. The fairing flew apart, not the engine. That's a big difference.

anovikov · on Oct 8, 2012

I don't know how you came to the conclusion that there was any kind of mechanical damage anywhere. There was most certainly none. The engine simply shut down, most probably because one of the many possible abort criteria was met, e.g. one of the engine's controlled variables, like pressure, flow, temperature of fluids/gases in many places, or turbopump RPM, exited the permitted safety corridor. It can be due to something as simple as a faulty sensor giving a false reading, or a limit being too strict (both have happened to SpaceX before). Then the engine was shut down, and the lower-temperature plume of the engine as it shut down looked to you like 'smoke of an explosion' on the video.

I am in fact near certain that the engine remained mechanically fine after shutdown and nothing else exploded or was broken. They will figure this out for the next flight.
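To make the abort-criteria idea above concrete, here is a minimal Python sketch of a limit check. The variable names and numeric limits are invented placeholders, not real Merlin values, and nothing here reflects SpaceX's actual flight software.

```python
# Hypothetical limit check: every name and number below is a placeholder.
LIMITS = {
    "chamber_pressure": (40.0, 70.0),      # (low, high), arbitrary units
    "turbopump_rpm": (15000.0, 25000.0),
    "fuel_inlet_temp": (250.0, 320.0),
}


def violated_limits(readings, limits=LIMITS):
    """Return the names of variables that are missing or outside their corridor."""
    bad = []
    for name, (low, high) in limits.items():
        value = readings.get(name)
        # A missing reading is treated like a violation, which is how a
        # faulty sensor can trigger a shutdown of a healthy engine.
        if value is None or not (low <= value <= high):
            bad.append(name)
    return bad


def should_shut_down(readings, limits=LIMITS):
    """Abort if any single monitored variable leaves its permitted range."""
    return len(violated_limits(readings, limits)) > 0
```

In this sketch, a limit that is set too strict would simply be a corridor that a healthy engine occasionally leaves.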

anovikov · on Oct 8, 2012

Yeah, now I see from the slow-mo that the aerodynamic shell got ruptured, but it's no big deal; it's obviously not designed to handle any loads from INSIDE.

priv_acy · on Oct 8, 2012

They said the engine fairing ruptured. That is most definitely mechanical damage.

jlgreco · on Oct 8, 2012

If SpaceX says there is an engine fairing, I'm going to say there is one.

InclinedPlane · on Oct 8, 2012

The headline has been changed though the URL is still the same (which is expected given the way most CMSes work).

cameldrv · on Oct 8, 2012

The engine is designed to hold high pressures, but it ruptured, and then the rapidly expanding gasses tore off other parts of the engine. It didn't disintegrate, but that sure sounds like an explosion to me.

pyre · on Oct 8, 2012

Not a rocket scientist, but using a car analogy, this sounds like the engine block vented gases and blew off the hood of the car, while the engine block itself received no damage.

ChuckMcM · on Oct 8, 2012

Looking forward to the SpaceX update.

One of the things that struck me is that trying to watch the control room at the same time as this anomaly I have yet to pick out anyone who 'flinched' or made any sort of noted move. That has left me wondering if they knew when this happened that it happened. I have to believe they did.

I remember watching the faces of the people in the control room when they did TV shots of the control room of NASA and noting that there was always someone who knew that things weren't going to plan, their face betrayed that knowledge.

That said, it looks like their primary cargo was fine, but they ended up putting their secondary cargo into a 'backup' orbit.

trafficlight · on Oct 8, 2012

I suppose they saw changes in the numbers, but they were still hitting their marks, so they didn't worry too much. I watched the livestream, and honestly, I didn't even know something bad had happened. Maybe they weren't glued to the video but rather the numbers on their screens?

Although, historically it seems that NASA control teams pride themselves on their ability to stay levelheaded during bad situations. Are there a lot of ex-NASA employees at SpaceX?

ChuckMcM · on Oct 8, 2012

It's a lot of data to comprehend. Doing operations on a large cluster architecture has a similar issue, which is that there are many things that can go wrong and are 'ok' because the system is designed to deal with them seamlessly. But I'm a big fan of tools that let you have a red/yellow/green indication of nominal/warn/fail for each subsystem. Looking at such a display you can comprehend that the entire system is 'ok' (or not).

caw · on Oct 8, 2012

I manage a lot of complex data where many things could result in an undesired state. Traffic lights don't always illustrate these conditions well enough, or let you diagnose the problem quickly.

Take the somewhat simple example of rebuilding a drive array.

All drives = green;

Within RAID tolerance = yellow

Degraded = red

What about the status of the hotspare? Is yellow rebuilding? What if there's no more hot spares, but you're within RAID tolerance?

I'm not knocking traffic lights (sometimes they're good), but it's worth spending time trying to figure out your data displays to give you the whole picture. It takes a while, but makes monitoring so much easier.
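As a rough illustration of the drive-array example, the Python sketch below keeps the traffic light but carries the details the colour hides. The field names are hypothetical, and reading "red" as "more failures than the RAID level tolerates" is an assumption, since the list above does not define it precisely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayStatus:
    failed_drives: int   # drives currently failed or missing
    tolerance: int       # failures the RAID level can absorb
    hot_spares: int      # unused spares still available
    rebuilding: bool     # True while a rebuild is in progress

    def light(self) -> str:
        """Collapse the state into a three-colour scheme."""
        if self.failed_drives == 0:
            return "green"
        if self.failed_drives <= self.tolerance:
            return "yellow"
        return "red"  # assumption: red means tolerance is exceeded

    def summary(self) -> str:
        """Keep the colour but add the facts the colour hides."""
        rebuild = "rebuilding" if self.rebuilding else "not rebuilding"
        return (f"{self.light()}: {self.failed_drives} failed "
                f"(tolerance {self.tolerance}), {rebuild}, "
                f"{self.hot_spares} hot spare(s) left")
```

Two arrays can both be "yellow" while one is rebuilding onto a spare and the other has no spares left; only the summary line tells them apart.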

ChuckMcM · on Oct 8, 2012

Actually, large-format data displays are great for calling out unexpected transitions. I had a chance to look at the control panel of a nuclear reactor for an aircraft carrier once, and what I learned was that the panel had annunciator lights that would blink on an 'unexpected' change. So, for example, steam pressure would be static in one of two states (cold shutdown, operation) and smoothly changing when changing state, so the light would start blinking if either the steam pressure changed when the reactor wasn't changing state, or if it stopped changing in the correct way when the reactor was changing state.

Looking across a sea of lights, if you saw one that was blinking you would go over to it, read off the legend (that would tell you the conditions under which that light would blink) and then take action. Once action was in process you pushed the button to 'acknowledge' the anomaly. (which also set new parameters for when it would start blinking again)

All in all a very cool system. I've been sorely tempted to hack something similar for our search and crawling clusters. Sadly I don't think it's practical to have a mockup of the Enterprise-D warp core which pulses in response to query rate, although that would be very very cool :-).
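The annunciator behaviour described above could be sketched roughly like this in Python. It is a simplification under stated assumptions: "changing in the correct way" is reduced to "moved by more than a tolerance since the last reading", and the class, method, and parameter names are all made up for illustration.

```python
class Annunciator:
    """Latches a blinking state when a reading behaves unexpectedly."""

    def __init__(self, legend, tolerance):
        self.legend = legend        # text an operator would read off the panel
        self.tolerance = tolerance  # largest step still counted as "static"
        self.last = None            # previous reading, None until the first one
        self.blinking = False

    def update(self, reading, transitioning):
        """Feed one reading; 'transitioning' says whether a state change is underway."""
        if self.last is not None:
            moved = abs(reading - self.last) > self.tolerance
            # Unexpected: movement while steady, or no movement during a transition.
            if moved != transitioning:
                self.blinking = True
        self.last = reading
        return self.blinking

    def acknowledge(self, new_tolerance=None):
        """Operator acknowledges the anomaly and may set new parameters."""
        self.blinking = False
        if new_tolerance is not None:
            self.tolerance = new_tolerance
```

The light stays latched until `acknowledge()` is called, which matches the "push the button once action is in process" step.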

THE_PUN_STOPS · on Oct 8, 2012

Given that the rocket is specifically designed to operate normally with an anomaly such as this, I'd say they had faith in their rocket to perform as it did. I think the statement at the end of the update sums it up well:

>Falcon 9 did exactly what it was designed to do. Like the Saturn V, which experienced engine loss on two flights, Falcon 9 is designed to handle an engine out situation and still complete its mission.

priv_acy · on Oct 8, 2012

I don't know if this is how this particular mission worked, but when I worked at NASA, there were multiple control rooms, even spanning multiple space centers. It's entirely possible whoever was monitoring this was not where you were looking.

vsearch · on Oct 8, 2012

There was no explosion: SpaceX CRS-1 Mission Update: October 8, 2012 http://www.spaceref.com/news/viewpr.html?&pid=38825

From SpaceX "Approximately one minute and 19 seconds into last night's launch, the Falcon 9 rocket detected an anomaly on one first stage engine. Initial data suggests that one of the rocket's nine Merlin engines, Engine 1, lost pressure suddenly and an engine shutdown command was issued immediately. We know the engine did not explode, because we continued to receive data from it. Our review indicates that the fairing that protects the engine from aerodynamic loads ruptured due to the engine pressure release, and that none of Falcon 9's other eight engines were impacted by this event."

glimcat · on Oct 8, 2012

Maybe it's the engineer in me, maybe it's the astronomy geek who watched two shuttles explode.

But I am way more impressed with "engine failed, still got to orbit safely" than I was with the already titanic feat of making it to orbit in the first place.

ynniv · on Oct 8, 2012

When pushing the envelope, anything better than catastrophic failure is success. That an engine exploded and both primary and secondary missions were still completed is fantastic.

cowsaysoink · on Oct 8, 2012

Rocketships are insane; there have to be many, many fail-safes because failures happen all the time. I remember a NASA engineer giving a talk on engineering safety when I was going to school where he said that the probability of catastrophic failure in NASA's launches is 1 in 100 and that is as low as they are able to make it at this point in time.

jmharvey · on Oct 8, 2012

Closer to 2 in 100: 2 shuttle losses in 135 flights, plus 1 catastrophic failure out of 32 missions in the pre-shuttle era.
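For anyone who wants to check that arithmetic, a small Python sketch follows. It uses only the counts quoted in the comment above and does not verify them independently; pooling the two eras into one rate is an assumption made here for illustration.

```python
from fractions import Fraction

# Counts as quoted in the comment; not independently checked here.
shuttle = Fraction(2, 135)       # shuttle losses per flight
pre_shuttle = Fraction(1, 32)    # pre-shuttle catastrophic failures per mission

# Pooled rate: total failures over total flights.
combined = Fraction(2 + 1, 135 + 32)


def percent(rate):
    """Express a Fraction as a percentage rounded to two decimals."""
    return round(float(rate) * 100, 2)


print(percent(shuttle), percent(combined))
```

The pooled figure comes out a little under 2 in 100, which is consistent with "closer to 2 in 100" than to 1 in 100.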

Cushman · on Oct 8, 2012

This is great for SpaceX. A more-or-less anticipated event that was handled perfectly by automatic systems is being reported as a potentially catastrophic failure. They get all the great press for proving the safety of their design without having to go through the stress of an actual potentially catastrophic failure.

jerrya · on Oct 8, 2012

Fantastic news!

spdy · on Oct 8, 2012

Those engineers who designed this system can be proud. The worst possible problem occurred and the backup plans worked out perfectly.

tsotha · on Oct 8, 2012

This isn't the worst possible problem. Engines can fail in a far more catastrophic fashion. If the surrounding engines and superstructure had been damaged things wouldn't have turned out so well.

riffraff · on Oct 8, 2012

they most certainly should, but I'd argue only "The worst possible _planned_ problem occurred" :)

salem · on Oct 8, 2012

There is a difference between planned and anticipated.

InclinedPlane · on Oct 8, 2012

Some important points on this subject.

First, the 1st stage has redundancy against engine failures however the 2nd stage (which uses largely the same engine as in the 1st stage) has only one engine. So if the per-engine failure rate is too high that could spell bad news for overall vehicle reliability even if the launcher can survive 1st stage engine failures remarkably well. Some reasons to be optimistic: 2nd stage engines have much lower aerodynamic loading and don't have to operate through "max-Q" as the 1st stage engines do (which was incidentally the point in time that the engine failure on this flight happened).

Second, the Falcon 9's 1st stage engines are arranged in a 3x3 grid which seems to result in some unfavorable aerodynamic forces on the engines on the corners. It's possible that this contributed to the engine failure (which occurred in a corner engine) and it's also possible that it contributed to the destruction of the engine fairing after it was shut-down.

Third, the particular engine in use on this vehicle (the Merlin-1C) will only be used on one more flight before being replaced (in the Falcon 9 v1.1) with substantially redesigned Merlin-1D engines (in both the 1st and 2nd stages). Additionally, the engine arrangement on the first stage will change to be octagonal, radially symmetric instead of a grid.

It's good to know the systems and structures to protect against 1st stage engine failure work well, however a lot of the reliability analysis up to this point is somewhat obsoleted by the imminent change in design. I suspect that the engine layout and upgrade will lead to greater overall reliability, but it will take several flights to prove that.

Anyway, some things to chew on.

jlgreco · on Oct 8, 2012

Perhaps also worth mentioning is that the proposed launch abort system is supposed to work at any point of the mission. An engine failure in the second stage would in theory not be a death-sentence to anyone who happens to be riding.

InclinedPlane · on Oct 8, 2012

Yes! More so if they have the data and the flight history to show that they can shut down malfunctioning engines before anything worse happens, which raises the survival probabilities of a manned capsule considerably.

cpeterso · on Oct 8, 2012

> the Falcon 9's 1st stage engines are arranged in a 3x3 grid which seems to result in some unfavorable aerodynamic forces on the engines on the corners.

Why are the engines arranged in a 3x3 grid instead of something symmetrical like a circle?

eco · on Oct 8, 2012

The Falcon 9 v1.1 will switch to a layout with a circle of 8 engines and another in the middle (as well as using the more powerful Merlin 1D engines). The v1.1 is expected to replace the current v1.0 next year.

archon · on Oct 8, 2012

Uninformed guess: They didn't have enough space for anything fancier than a grid pattern. Perhaps the new engines have a smaller footprint.

hga · on Oct 9, 2012

I've read one of the changes in the new engines will be their using their own turbopumps. It's easy to imagine how their previously sourced ones wouldn't allow an internal layout that was optimal for the external layout of the engines.

jamesaguilar · on Oct 8, 2012

Then again, the probability of failure on the single-engine component is only a ninth the probability of failure on the nine-engine component (assuming there are no other factors, which of course isn't a safe assumption, but I don't know how to adjust).
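A quick Python sketch of that comparison, assuming engines fail independently with the same probability. The per-engine probability used is a hypothetical number picked only to show the shape of the result.

```python
def p_any_failure(p_engine, n_engines):
    """Probability that at least one of n independent engines fails."""
    return 1.0 - (1.0 - p_engine) ** n_engines


p = 0.01  # hypothetical per-engine failure probability, not a measured value

one_engine = p_any_failure(p, 1)
nine_engines = p_any_failure(p, 9)
ratio = nine_engines / one_engine

print(one_engine, nine_engines, ratio)
```

Under these assumptions the ratio is slightly below 9, so "a ninth" is a good approximation when the per-engine probability is small. It says nothing about consequences: losing one of nine first-stage engines can be survivable, while losing the only second-stage engine is not.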

shorttime · on Oct 9, 2012

How do you know all of these things? I would like to read up.

MikeCodeAwesome · on Oct 8, 2012

As the article surmised, one of the engines did indeed fail and the craft corrected for the failure.

"Falcon 9 detected an anomaly on one of the nine engines and shut it down. As designed, the flight computer then recomputed a new ascent profile in realtime to reach the target orbit…"

http://www.parabolicarc.com/2012/10/07/falcon-9-suffers-engi...

zhaphod · on Oct 9, 2012

I think it is important to read the latest press release from SpaceX where they clearly state that the engine did not explode.

http://www.spacex.com/press.php?page=20121008

I really wish that we had had a perfect launch, but that's not the reality. I don't think, technically, this hurts SpaceX. But once perceptions are formed it is hard to change them even if you throw mountains of data/facts at them. Case in point: the "death panel" buzzword that was used against Obamacare. I really hope this anomaly doesn't harm SpaceX.

matt2000 · on Oct 8, 2012

Am I correct in thinking that previous rocket designs don't have any redundancy built in? This seems like a big improvement in reliability but I'm not super familiar with other rocket designs.

jccooper · on Oct 8, 2012

Some do, most don't. Saturn V could (and did several times) survive failure of a first-stage motor. STS (Shuttle) theoretically could survive an engine loss under some conditions. The Soviet N1 could well have survived a motor loss (it had 30 of them after all), had it flown and had they been able to control all those engines.

Historically, most rocket designs push the performance envelope so hard they have little or no margin. Much of this attitude is historical, government rockets mostly being descended from ICBMs. The other part is that the rocket equation severely penalizes extra weight, and the window between "robust" and "too heavy to fly" isn't all that large.

lutusp · on Oct 8, 2012

> Am I correct in thinking that previous rocket designs don't have any redundancy built in?

The answer depends on specifics. The NASA Space Shuttle was able to reach orbit after the failure of one of its three engines, but only if the payload and/or altitude weren't near their range extremes. In other circumstances, the timing of the failure might determine whether the mission could proceed.

When I worked on the Shuttle, one design guideline was that no single-point failure modes should be allowed if it was possible to avoid them. Obviously this guideline was frequently not met. A one-word summary describing the avoidance of single-point failure modes is "redundancy".

codex · on Oct 8, 2012

I didn't know this before, but SpaceX has only done eight launches so far--half of them test flights. Perhaps this explains the anomaly.

natep · on Oct 8, 2012

I doubt that. If a software startup designed and built a distributed system that would automatically detect and recover from hardware faults (see: every 'design for failure' post ever), would you say that the hardware faults were due to inexperience or incompetence? Building redundancy into the system was a conscious decision, and they weren't "lucky" to have flown through this anomaly.

codex · on Oct 8, 2012

You misunderstand me. Failures are not due to inexperience. High failure rates are. If a single failure occurs in one out of eight launches, a double will occur once every sixty-four. While I believe the Falcon is double fault tolerant, I don't believe for a minute that this is the first failure seen in a SpaceX launch. If they've seen only one other failure in their eight launches, their overall failure rate (resulting in loss of rocket, possibly cargo and crew) would be one in 64. That is nearly the failure rate of the Shuttle. With more experience, they may be able to lengthen their mean time to loss (MTTL) by improving the failure rate.
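The arithmetic in that estimate can be written out as a short Python sketch. It takes the comment's own assumptions at face value: a one-in-eight single-failure rate and two failures being independent events. The Shuttle figure is the 2-in-135 count quoted earlier in the thread.

```python
from fractions import Fraction

single = Fraction(1, 8)       # one anomaly in eight launches, per the comment
double = single * single      # assumes the two failures are independent
shuttle = Fraction(2, 135)    # Shuttle losses per flight, as quoted in the thread

print(double, float(double), float(shuttle))
```

Under those assumptions the double-failure rate is 1 in 64, slightly higher than the quoted Shuttle loss rate. Whether eight launches are enough to support a one-in-eight estimate is a separate question.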

natep · on Oct 8, 2012

First of all, this launch was not a failure, full stop.

I don't know what your background in statistics is, but I'm impressed that you're able to deduce the details of such a complicated, stochastic process from only 8 observations, and are willing to extrapolate your predictions for 8 times as many more.

And for someone who loves to comment negatively on SpaceX/Tesla posts, maybe you could spend 5 minutes looking at their Wikipedia pages and see that yes, there have been failures (i.e. unable to achieve stated mission goals and sometimes destroying payloads).

priv_acy · on Oct 8, 2012

The mission wasn't a failure, but a major component failed in a way that is very concerning. There is a lot of work to be done before a sane human being will get in one of those, let alone approach the safety record of aircraft that Musk is so fond of alluding to.

natep · on Oct 9, 2012

If you feel that you are qualified to say it is a very concerning component failure based on the scant evidence available, then that's up to you.

priv_acy · on Oct 9, 2012

I am, and of course it is concerning - the engine shut down and debris was strewn about. Do the math on the failure rates. Unless things are dramatically improved (the goal of course), these things are just not safe for people outside of the dare-devil set. That's not a knock against SpaceX - this is hard stuff. It is a slight knock against Musk's over-the-top marketing that has us on Mars in 15 years, which, in my opinion, is unrealistic.

saraid216 · on Oct 8, 2012

This was basically equivalent to a hard disk failing inside a RAID array. Though I think hard disk failures are more common than engine failures.

ceejayoz · on Oct 8, 2012

Earlier HN discussion: http://news.ycombinator.com/item?id=4626866

notjustanymike · on Oct 8, 2012

Never before has "try { ... } catch (e) { ... }" been so important

btilly · on Oct 8, 2012

Actually my understanding from talking to their first software developer is that their systems are written in C++ and they do not use exceptions. Ever.

If that surprises you, consider that the default behavior of an uncaught exception, anywhere in your code, no matter how minor, is to crash your program. While you're in flight, the last thing that you want to see is a software crash. Having software encounter an unanticipated state might or might not destroy the rocket. Having your control system spontaneously cut out in flight definitely will destroy the rocket.

WalterBright · on Oct 8, 2012

This is incorrect. In avionics software, you want software that self-detects its own failure to quit immediately, and engage the backup system.

You do NOT want it to "soldier on" once it has entered an unknown state.

The general problem with error codes is they can be so easily ignored, and then the software is operating in an unknown and untested state.

I suspect the actual reason why they eschewed exceptions is because exceptions may not be able to guarantee hard realtime latency.
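The "quit immediately and engage the backup" pattern can be sketched in a few lines. This is Python purely for illustration; the names are hypothetical, the control laws are toy stand-ins, and it is not a claim about how SpaceX or any avionics system is implemented.

```python
class SelfCheckFailed(Exception):
    """Raised when a channel detects that its own state is not trustworthy."""


def primary_channel(reading):
    # Toy self-check: refuse to compute on an input outside the tested range.
    if not (0.0 <= reading <= 100.0):
        raise SelfCheckFailed(f"reading {reading} outside tested range")
    return reading * 0.5


def backup_channel(reading):
    # Deliberately simple fallback: always command a safe default.
    return 0.0


class Supervisor:
    """Runs the primary until it reports a fault, then stays on the backup."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.primary_healthy = True

    def step(self, reading):
        if self.primary_healthy:
            try:
                return self.primary(reading)
            except SelfCheckFailed:
                # Do not soldier on: the primary is never used again.
                self.primary_healthy = False
        return self.backup(reading)
```

The design choice being illustrated is that the switch is one-way: once the primary has declared itself to be in an unknown state, later good-looking inputs do not bring it back.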

quotemstr · on Oct 8, 2012

I've actually considered getting into aerospace software development. It sounds awesome --- after all, you're writing software for devices that leave the planet! --- but what's given me pause is that the software creation process itself is so necessarily conservative that I'm afraid the process would strip all the joy out of the actual coding.

notjustanymike · on Oct 8, 2012

(Although to be fair, I sincerely doubt Falcon 9 is running JavaScript)

habosa · on Oct 8, 2012

That could also be Java.

avar · on Oct 8, 2012

Because waiting on garbage collection is definitely what you'd want in a rocket.

anonymouz · on Oct 8, 2012

Judging from the debris on the video, they didn't collect their garbage.

malkia · on Oct 8, 2012

And never will. If "try/catch" were involved, the rocket would've just failed rather than "continue/resume".

waratuman · on Oct 8, 2012

Misleading title.