40-60% of Yahoo's users have an empty cache experience (yuiblog.com)
35 points by nathanwdavis on Sept 25, 2009 | hide | past | favorite | 11 comments



We noticed this after finding an error in our Apache config: some of our assets were set to be cached for longer than we really wanted (30 days), much to our chagrin. It turned out that even if you tell the browser to keep something for a very long time, it most likely won't stay in the cache.

Our general assumption was that this is because the default browser cache size is tiny compared to the amount of data users now consume, coupled with every webpage you visit putting stuff into the cache (rightly so).

I believe the default cache size in Firefox is still set at 50 MB, which seems very small.


The default size for Firefox 3.0.12 on Ubuntu is indeed 50 MB.

A fresh reload of CNN.com: 649 KB
A fresh reload of YouTube.com: 113 KB
A fresh reload of CDW.com: 331 KB

That's still a lot of sites in a 50 MB cache, though I imagine a single YouTube video would go a long way towards filling it up.
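
For a rough sense of scale, here is a back-of-the-envelope sketch of how many fresh loads of each page above would fit in a 50 MB cache. It assumes 1 MB = 1024 KB, that every byte of the reload gets cached, and that nothing else competes for the space; the function name is made up for illustration.

```python
# Rough arithmetic only: how many fresh page loads fit in a 50 MB cache?
# Assumptions (not from the thread): 1 MB = 1024 KB, every byte is cached,
# and no other site is competing for the same space.

CACHE_KB = 50 * 1024  # 50 MB default cache, expressed in KB

# Page weights quoted in the comment above, in KB.
PAGE_KB = {
    "CNN.com": 649,
    "YouTube.com": 113,
    "CDW.com": 331,
}


def pages_that_fit(cache_kb: int, page_kb: int) -> int:
    """Return how many whole copies of a page of page_kb fit in cache_kb."""
    return cache_kb // page_kb


if __name__ == "__main__":
    for site, size in PAGE_KB.items():
        print(f"{site}: about {pages_that_fit(CACHE_KB, size)} loads")
```

That works out to roughly 78 CNN-sized pages, so a few days of ordinary browsing could plausibly turn the cache over.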


Quoting the comments section:

    The best possible way to achieve cacheability of an object is to perform
    a server-side re-write of all linked content (images, scripts etc) and
    re-write the links to refer to a file name based off the MD5 hash of the
    file content.
    
    So, your link:
    http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif
    is re-written:
    http://us.i1.yimg.com/never-expire/A31D5F12.gif
    Where A31D5F12 is the MD5 hash.
Does anybody use this convention? It seems logical, but I haven't seen it widely used.
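
A minimal sketch of the naming scheme described in the quote, assuming the file contents are available as bytes at build or deploy time. The helper names are hypothetical, and truncating the digest to 8 hex characters is an assumption made only to mirror the quoted example (a full MD5 digest is 32 hex characters).

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(content: bytes, original_name: str, digest_chars: int = 8) -> str:
    """Build a file name from the MD5 of the content, keeping the extension.

    digest_chars=8 mirrors the shortened hash in the quoted example; pass 32
    to keep the full digest and reduce the chance of collisions.
    """
    suffix = PurePosixPath(original_name).suffix
    digest = hashlib.md5(content).hexdigest().upper()[:digest_chars]
    return f"{digest}{suffix}"


def never_expire_url(base_url: str, content: bytes, original_name: str) -> str:
    """Return the rewritten URL under a hypothetical /never-expire/ prefix."""
    return f"{base_url.rstrip('/')}/never-expire/{hashed_name(content, original_name)}"
```

Because the name changes whenever the bytes change, such URLs can be served with a far-future expiry: a new version of the file simply gets a new URL.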


I first heard about this several years ago from a CDN; their proxy was automatically rewriting the links so the Web developer didn't have to do anything. I've never seen it used in the open source world.


Does this mess up what the new 'semantic web markup' movement is pushing? ... (e.g., Google's image search uses the image name as a signal of what the image might be, albeit a small one)


It shouldn't, you can just have something like static.example.com/picture-of-cat-playing-piano-0123456789abcdef0123456789abcdef.jpg
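
As a sketch of that idea, the descriptive part of the file name can be kept and the full 32-character MD5 digest appended before the extension. The function name is made up for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def descriptive_hashed_name(original_name: str, content: bytes) -> str:
    """Keep the human-readable stem and append the content's MD5 hex digest."""
    path = PurePosixPath(original_name)
    digest = hashlib.md5(content).hexdigest()  # 32 lowercase hex characters
    return f"{path.stem}-{digest}{path.suffix}"
```

The readable stem stays available as a naming signal, while the hash still changes the URL whenever the content changes.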


Don't rewrite the URL completely, just append your session-data, caching hash and other metadata to the URL path.


I've seen it used, yes. I don't personally use MD5, but I've used variations based on modification timestamps.
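
One possible form of the timestamp variation, sketched under assumptions: the comment does not say how the timestamp is attached, so appending it as a `v` query parameter (and the function name itself) is purely illustrative.

```python
import os


def mtime_versioned_url(url_path: str, file_path: str) -> str:
    """Append the file's modification time (whole seconds) as a version tag.

    The "?v=" query parameter is an assumed convention; the timestamp could
    just as well be embedded in the file name.
    """
    mtime = int(os.path.getmtime(file_path))
    return f"{url_path}?v={mtime}"
```

Unlike a content hash, a timestamp changes when a file is touched or redeployed even if its bytes are identical, so it may invalidate caches more often than strictly needed.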


So does this constitute an indirect measurement of the size of the cache-savvy porn-consuming population?


Perhaps browsers could have a whitelist that the user could modify, so that a select few sites keep their cached files even when the cache is cleared from a menu option...

I think it might constitute :)


Firefox:

Tools->Start Private Browsing.

Comes in handy when you're giving someone your laptop to surf and you don't want them polluting your history (it distracts me when Firefox auto-completes something I don't recognize).



