Hacker News

Because that's usually the case. Crunch time, disaster recovery, and emergency fixes are common in every sector from video games to aerospace. If you can't fix the issue, switch to a secondary, or rebuild from backup, or throttle users, or process manually, or do anything other than be completely down.
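For the "throttle users" option above, one common degraded-mode technique is a token-bucket rate limiter that sheds excess load instead of letting the whole service fall over. This is a minimal illustrative sketch, not anything Robinhood is known to use; the class name and parameters are hypothetical.

```python
import time

class TokenBucket:
    """Token-bucket limiter: allow roughly `rate` requests per second,
    with bursts up to `capacity`. Requests beyond that are shed so the
    backend degrades instead of going completely down."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # serve this request
        return False      # shed this request (e.g. return HTTP 503)
```

Under heavy load a gateway would call `allow()` per request and reject the overflow, keeping some fraction of users served rather than none.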

RH wasn't prepared with any contingency. They should have had a resolution for their users, even if they couldn't find or fix the original cause. That's the failure I'm talking about.

See the two other users in this thread who describe similar high-pressure situations.




Depends on what the issue is. It could be something quickly fixable, or it could not be. Going by the lack of a resolution, I'm assuming it's not.

Reality is, without knowing more about what’s causing this it’s impossible for either of us to say. If there is indeed some fundamental bottleneck that was previously not known, then I certainly won’t be surprised if it takes a while to sort out.

Now you can say they should've load tested, capacity planned, etc. But we are where we are. You still can't go back in time to turn this into a quickly fixable problem if it currently isn't one.

Edit: I'm also pretty disappointed that we don't know more about the root cause. As a user I'd want to know what the issue was and what they're planning to do about it, so I can evaluate whether I should trust them going forward.



