Empty image src can destroy your site (nczonline.net)
41 points by mk on March 17, 2010 | 23 comments



If your site is so delicately balanced that a single extra request can bring it to its knees then you have larger problems than a blank img tag.


If your site is one of the world's top three most-visited websites and every pageview generates a single extra request, then you bet your ass it's going to cause problems.


If your site is one of the world's top three most-visited websites and every pageview generates a single extra request and that brings you down, then like I said you've got bigger problems. Your infrastructure should not be so delicate that the difference between uptime and downtime boils down to a single extra hit per pageview.

I'm doing 500 hits per second right now, do I care if it becomes 600 or 700? Not at all, because that's still a nice, safe, comfortable distance from my maximum capacity.


Your "you've got bigger problems" then translates to "you should have capacity for at least double your normal traffic".

Capacity planning is complex enough that I wouldn't make such a blanket statement so judgmentally.


One extra hit per pageview is not necessarily double your traffic; it's probably below a 1% increase a lot of the time.


You seem to have the impression:

1. That what I said was hypothetical. It wasn't. This happened.

2. That one pageview amounts to no more effect on a server than causing it to process another request and send a few more bytes down the wire.

If you're serving a static HTML page, then an extra pageview probably isn't a big deal. If you're serving an uncacheable, dynamic page that must federate requests to many, many different backends and pull in live content in real time, one pageview is a big deal.

And even if you have enough capacity to handle 3 or 4 or 10 times your usual load, when dynamic pageviews suddenly double (or worse, if there are multiple sourceless images on the page) for no apparent reason, a whole shitload of alarms go off, and it's not the sort of thing you just ignore. It indicates a Problem. It might be just a Problem right now, but it could very soon become a Big Problem. Which is why you try to fix it right away, no matter how big your safety margin is.


Eh, sucks that it happened to you, but looking at the browser vendors' (lack of) reaction and even the bugzilla discussion ... I think you and Yahoo might be the only ones to have ever had any significant repercussions come from this, outside of some minor mobile network performance issue that sparked the bugzilla submission.


Riiight. Because browser vendors always fix important issues quickly. ;)

What's more likely is that very few large sites are experiencing this problem, and of the smaller sites that may have experienced it, many probably never even knew it, and those that knew it probably said "huh, weird", fixed it, and didn't bother mentioning it to anyone.

Nicholas Zakas and Yahoo! are trying to get browsers to actually fix the problem in order to prevent this from happening to other people, while also spreading the word so that people realize this is a potential problem.

One place other than simple server load where this problem could be extremely harmful is in measuring ad impressions. Even if the extra pageviews aren't enough to create a blip on your capacity radar, if you're telling advertisers that you served x-million impressions when only half of those were actually seen by users, it's going to seriously screw up your clickthru metrics, and your advertisers are going to be pretty pissed.


It's not so much that they're slow to respond, it's that they also own many of the sites that would be as susceptible as Yahoo. That combination suggests it is not, or has not yet been, a problem for most.


this one also "works" with CSS background images by the way. I had one case during development here with ~10 elements that had a background image set to url() (in a style attribute. don't ask. please). And of course, that page was really taxing the server doing a complex calculation.

Of course, finding this was actually a good thing, because we could not only fix the empty background image but also fix the whole page so that a) the processing is done in the background and b) the result is cached.

So if you are careful about your architecture, an empty image url or two really is a non-issue. If on the other hand, you are not careful, 10 empty image urls on the wrong page can really take down your machine.

There IS a difference between 50 and 500 concurrent users.


alarmist headline - it's a minor inconvenience at best. possibly difficult to track down why there are such duplicate requests going to your site and where the empty img tag is, but it's hardly going to "destroy your site".


Consider a very popular website which runs on the last generation of software; that is, it is susceptible to the C10K problem (tl;dr: thread/process-based concurrency starts to choke at ~10,000 simultaneous connections, largely regardless of vertical scaling). The server, or each server in the pool, might run at 75% load normally.

And then traffic doubles.

It's easy to imagine the hurt.


Traffic won't double. Total traffic is all requests (images, stylesheets, scripts), not just requests to the root HTML page.

Not only that, but the client's browser probably caches it anyway.


You're assuming that your static assets (js, css, images) aren't served from a CDN (pretty much the standard at companies like Yahoo!). When your main server is serving just the HTML it's much more likely that your requests could double.
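
To put rough numbers on the disagreement above, here is a small sketch of the arithmetic. The request counts are hypothetical and chosen only for illustration; the point is that the relative increase depends on how many requests per pageview the origin server handles itself.

```python
def origin_increase(assets_on_origin: int, extra_requests: int = 1) -> float:
    """Fractional increase in origin requests per pageview.

    Baseline per pageview is 1 HTML request plus however many asset
    requests the origin serves itself (hypothetical figures).
    """
    baseline = 1 + assets_on_origin
    return extra_requests / baseline


# Origin serves the HTML plus 99 assets: one extra hit is a 1% increase.
everything_on_origin = origin_increase(assets_on_origin=99)

# Assets live on a CDN, origin serves only the HTML: one extra hit doubles it.
assets_on_cdn = origin_increase(assets_on_origin=0)
```

This counts requests only. It says nothing about cost per request, which is the other half of the argument: a dynamic page hit can be far more expensive than a static asset hit.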


Any website serving its assets from a CDN will be a) unlikely to make a mistake like this empty image src thing, and b) have a well-cached front-page that would be unlikely to impact on server load.


Maybe. It's still a bit alarmist though. For 99% of sites, it will barely have an effect. Best practices for websites serving millions of pages a day truly are different than for websites serving thousands of pages a day. They are also different for web apps versus largely static pages. There is too much emphasis on absolutes in the web development community.


My favorite thing about this article is that it made me stop and think through why the bug would actually exist in so many browsers.

I can make sense of the IE bug: "" doesn't start with a protocol, like "http://", so it's not a full URL. It doesn't start with "/", so it's not relative to the server root. Therefore, it must be a relative URL, and the browser tries to download the image named "" in the same directory as the page.

But the Safari and Chrome problem baffles me. To have happened in Safari, Chrome, and Firefox, it must be a pretty straightforward mistake, but I just can't see it. Anyone else have guesses?

(Safari and Chrome admittedly use the same rendering engine, but still, Firefox doesn't.)
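
One possible explanation, offered as a guess: under RFC 3986 reference resolution, an empty relative reference resolves to the base URL itself, so a browser that treats src="" as an ordinary relative reference would end up requesting the page again. The IE behavior described above looks more like resolving "." against the page URL. A minimal sketch using Python's urllib.parse.urljoin; the example.com URL is hypothetical, and this demonstrates URL resolution only, not any browser's actual code path.

```python
from urllib.parse import urljoin

# Hypothetical page URL, used only for illustration.
PAGE = "http://example.com/dir/page.html"


def resolve(reference: str, base: str = PAGE) -> str:
    """Resolve a relative reference against the page URL."""
    return urljoin(base, reference)


# Empty reference: resolves to the page itself.
same_document = resolve("")

# "Current directory" reference: resolves to the containing directory.
directory = resolve(".")
```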


Wouldn't his proposed method of returning nothing when URL==referer "destroy" a form that posts to itself (action="." or "")?
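
A rough sketch of how such a check might be narrowed to sidestep the form case. Restricting it to GET requests is an assumption made here, not necessarily what the article proposes, and the function name and the plain-dict header lookup are hypothetical.

```python
def is_probable_empty_src_request(method: str, url: str, headers: dict) -> bool:
    """Heuristic: a GET request whose Referer is its own URL.

    POST requests are never flagged, so a form that posts to itself
    (action="" or ".") is left alone. Assumes `headers` is a plain dict
    keyed by the canonical "Referer" spelling.
    """
    if method.upper() != "GET":
        return False
    referer = headers.get("Referer")
    return referer is not None and referer == url
```

This would still flag a GET form that submits to its own unchanged URL, or a link from a page to itself, so it is a heuristic rather than a safe default.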


Depending on your web framework, this may cause problems unrelated to load. If the framework identifies components in the DOM tree via auto-generated IDs, the additional requests can cause these to be regenerated. In your JavaScript, or even somewhere in the server side code which didn't expect further requests, the old value might still be used, breaking the site.

I learned this the hard way recently while working on an Apache Wicket app, see my mailing list post: http://old.nabble.com/Nasty-problem-with-%22component-not-fo....


Another workaround is to assign a data url to the image tag, then replace the source with javascript.

Something like <img src="">
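
For illustration, a sketch of generating such a placeholder tag on the server side. The base64 payload is a commonly used 1x1 transparent GIF, and the data-src attribute name is a hypothetical convention for the script that later swaps in the real source; neither comes from the comment above.

```python
import base64
import html

# Commonly used 1x1 transparent GIF, base64-encoded (an assumption for this sketch).
PIXEL_B64 = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"


def placeholder_img(real_src: str) -> str:
    """Build an <img> tag whose src is a data URI instead of an empty string.

    The real URL is parked in a hypothetical data-src attribute for
    client-side JavaScript to copy into src later.
    """
    data_uri = "data:image/gif;base64," + PIXEL_B64
    return '<img src="{}" data-src="{}">'.format(
        data_uri, html.escape(real_src, quote=True)
    )


tag = placeholder_img("/images/photo.jpg")
```

Because the src is never empty, the browser has nothing to resolve against the page URL and makes no extra request for the placeholder.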


A second request won't destroy your website. Yes, a second request takes a tiny bit more bandwidth, but realistically, its impact is nil.


Agreed (and I actually saw this on a 40M pv/day site from the inside myself); most of the stuff is cached by definition. Still a bad thing tho.


Useful link with good timing; this just came up today for me.

Wasn't destroying my site, but good to fix regardless.



