
I would wager that they run a single configuration, as it grants a significant economy of scale, rather than vertical partitioning of their stack, which would require headroom per customer and/or slice. This way you just need global headroom.

Having done something similar with Varnish in the past (for an ecommerce platform), my guess is that they take changes made in the control panel and deploy them to a global config, and that someone submitted something lethal that somehow passed validation, got published, and then failed to parse.
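
To make that failure mode concrete, here is a minimal Python sketch of the guard I have in mind: run every customer fragment through the same parser the edge uses before it is merged into the global config, and keep out anything that fails. Everything here is an assumption for illustration only. `parse_fragment`, `build_global_config`, and the customer names are hypothetical, and JSON stands in for whatever config language is really in use. It is not a description of how their pipeline actually works.

```python
from __future__ import annotations

import json


def parse_fragment(raw: str) -> dict:
    """Stand-in for the edge's real config parser (hypothetical; JSON is used only for illustration)."""
    value = json.loads(raw)
    if not isinstance(value, dict):
        raise ValueError("fragment must be a JSON object")
    return value


def build_global_config(fragments: dict[str, str]) -> tuple[dict, dict]:
    """Merge per-customer fragments, leaving out any that fail to parse.

    Returns (accepted, rejected), where rejected maps customer -> error message.
    """
    accepted: dict = {}
    rejected: dict = {}
    for customer, raw in fragments.items():
        try:
            accepted[customer] = parse_fragment(raw)
        except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
            rejected[customer] = str(exc)
    return accepted, rejected


# Hypothetical input: one valid fragment and one truncated fragment.
fragments = {
    "shop-a": '{"ttl": 300}',
    "shop-b": '{"ttl": ',
}
accepted, rejected = build_global_config(fragments)
```

The point of the design is that validation and production share one parser, so a fragment cannot pass the check and then fail to parse once published. One bad customer change then costs that customer a rejected deploy instead of taking down the global config.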




This looks like quite a likely scenario.

But then we still don't know what they fixed: was it the incorrect configuration or the underlying bug? I would expect the former rather than the latter, because changing that specific configuration is probably neither difficult nor dangerous, while fixing bugs in the code seems riskier and would probably take more time to test.

We'll see if they publish a post-mortem. That has become more or less customary these days (and they are frequently quite interesting).


They were pretty clear about this in their response (linked in the article):

    Once the immediate effects were mitigated, we turned our attention to fixing the bug and communicating with our customers. We created a permanent fix for the bug and began deploying it at 17:25.

So they did both: first they reverted the config, then later they fixed the bug.



