
Unfortunately (and here's the human error), the URL of '/' was mistakenly checked in as a value to the file and '/' expands to all URLs.
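To see why a lone '/' is so destructive, here's a toy illustration (the prefix-match semantics are my assumption about how the file is interpreted, not Google's actual code): every URL path begins with '/', so a '/' entry flags everything.

    from urllib.parse import urlparse

    blacklist = ["/"]   # the bad check-in: one character

    def is_flagged(url):
        # Assumed semantics: an entry flags any URL whose path starts with it.
        path = urlparse(url).path or "/"
        return any(path.startswith(entry) for entry in blacklist)

    print(is_flagged("http://example.com/"))                    # True
    print(is_flagged("http://news.ycombinator.com/item?id=1"))  # True -- everything matches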

Wow, one character. That was one expensive character.




Incidentally, I'm glad to know that I can now tell my clients that I have a Google-class QA procedure: I make my changes directly on the live site, without testing them, and I promise not to blow up the home page for more than an hour!

Six sigma, baby!

Seriously, though: We've all made similar mistakes before (though, after the first few times, we usually try to catch them on the dev server or the staging server). The trick for Google will be to ensure that this one is not made again.


It sounds to me as if they did, and as if the badware blacklist implementation was working fine, but was given bad data. Given the rapid proliferation of new badware, it seems reasonable to bulkload blacklist updates to an existing and tested backend. Obviously, given the scant details, I could be wrong, but my impression is that this is a wonderful demonstration of GIGO.


I'm not sure it's practical to test this on a dev server with each update. All they really need to do is make sure "/", sites ending in "google.com", and maybe some other whitelisted sites are never included in the file.
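A pre-publish sanity check along those lines would be enough. A sketch only; the file format and the whitelist here are my assumptions, not Google's actual setup:

    WHITELIST_SUFFIXES = ("google.com", "youtube.com")   # hypothetical whitelist

    def validate_blacklist(path):
        """Refuse to publish a blacklist file containing obviously catastrophic entries."""
        with open(path) as f:
            for lineno, entry in enumerate(f, 1):
                entry = entry.strip()
                if not entry or entry.startswith("#"):
                    continue
                if entry == "/":
                    raise ValueError(f"line {lineno}: bare '/' would flag every URL")
                host = entry.split("/", 1)[0]
                if host.endswith(WHITELIST_SUFFIXES):
                    raise ValueError(f"line {lineno}: whitelisted host {host!r} in blacklist")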


I'm not sure it's practical to test this on a dev server with each update.

I tell myself that all the time. And then I pause, and sigh, and load up the dev site anyway, just to make sure the home page has not exploded. Because sometimes it has. Typos happen. Brain farts happen.

Obviously I wouldn't necessarily expect Google to test a dev server by hand every time they change some code. That procedure is for little people like me. [1] For something as important as Google I'd expect there to be a procedure whereby each new change gets pushed to a small group of boxes, which then run a few really simple automated acceptance tests ("If I do a search for a random term, do I get back a page with links that can actually be followed?") before the thing gets pushed live fifteen minutes later.

Apparently I expect too much. Perhaps I'm overengineering this. Maybe it's okay if, a couple of times a decade, we just cripple half the Internet for 30 minutes and give 5% of the world's computer users a virus scare.

---

[1] At least until I get lazier (in the Larry Wall sense of the word) and write some more scripts.
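For the record, the kind of acceptance test I mean is maybe twenty lines. Something like this, where the endpoint, query terms, and checks are all placeholders rather than anything Google actually runs:

    # Rough sketch of the "did we just break search?" smoke test.
    import random, re, sys, urllib.request

    def smoke_test(base="http://localhost:8080/search"):
        term = random.choice(["weather", "news", "python", "pizza"])
        with urllib.request.urlopen(f"{base}?q={term}", timeout=10) as resp:
            assert resp.status == 200, f"search returned {resp.status}"
            body = resp.read().decode("utf-8", errors="replace")
        links = re.findall(r'href="(https?://[^"]+)"', body)
        assert links, "no followable result links on the page"
        # Follow one result to make sure an interstitial isn't eating every click.
        with urllib.request.urlopen(links[0], timeout=10) as resp:
            assert resp.status == 200, f"result link returned {resp.status}"

    if __name__ == "__main__":
        try:
            smoke_test()
        except AssertionError as e:
            sys.exit(f"FAIL: {e}")
        print("OK")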


I've had production failures on very critical systems when files were edited and their \n line endings were silently replaced with \r\n. Once we found the problem it was simple enough to fix, and we changed the files in question to be robust against such changes in the future, but that was an... unpleasant day.
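For what it's worth, the usual way to make a parser indifferent to this is to stop splitting on a literal \n. A generic sketch, not the poster's actual system:

    def read_entries(path):
        # Read bytes so the platform doesn't translate line endings behind our back,
        # then let splitlines() accept \n, \r\n, or \r equally.
        entries = []
        with open(path, "rb") as f:
            for raw in f.read().splitlines():
                line = raw.decode("utf-8").strip()
                if line and not line.startswith("#"):
                    entries.append(line)
        return entries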


An embarrassing flub, sure, but perhaps not expensive. Google's ad clickthroughs may even have gone up during the glitch, since people were scared away from the natural results!


She wanted dinner and a movie.



