Cache Them If You Can (stevesouders.com)
136 points by enobrev on March 23, 2012 | 15 comments



One of the largest improvements I've seen, somewhere between caching and optimization in PHP, has been turning on APC. I'm not aware of any drawbacks; if you know of any, let me know. I didn't do any benchmarking, but it made a significant impact on page loads to my eye, as well as in the stats on the page.


While backend caching isn't what the article is about, you're absolutely right that an opcode cache such as APC significantly improves backend performance. It's one of the first things I check for when I'm optimizing a client's PHP site.
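
A quick sanity check looks something like this (a minimal sketch, assuming the standard APC extension):

    <?php
    // Verify the APC opcode cache is loaded and enabled.
    var_dump(extension_loaded('apc'));  // true if the extension is loaded
    var_dump(ini_get('apc.enabled'));   // "1" when caching is active
    print_r(apc_sma_info(true));        // shared-memory usage summary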


This is a great suggestion.

A while ago I did some benchmarks with hooking up APC and Squid proxy to the site I was working on at the time. The results were pretty amazing. I saw an increase in capability/performance of 400-700%. I'd probably choose Varnish over Squid if I had to do it again, however. The Squid configuration file is a bit of a nightmare.

http://randomdrake.com/2009/07/14/benchmark-results-show-400...


> The HTTP Archive doesn’t save response bodies so the determination of “identical” is based on the resource having the exact same URL, Last-Modified, ETag, and Content-Length.

I don't like this. Dynamic pages don't usually send any of those headers, so the page will appear identical in his analysis, despite not actually being identical.


Dynamic pages, when properly programmed, could and should send these headers. Also, some frameworks and/or proxies manage this for you.
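
For example, a minimal conditional-GET sketch in PHP (render_page and get_last_update_time are hypothetical stand-ins for your app):

    <?php
    // Emit validators, then answer conditional requests with 304.
    $content      = render_page();           // hypothetical: your page renderer
    $lastModified = get_last_update_time();  // hypothetical: a Unix timestamp
    $etag         = '"' . md5($content) . '"';

    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');
    header('ETag: ' . $etag);
    header('Content-Length: ' . strlen($content));

    $inm = isset($_SERVER['HTTP_IF_NONE_MATCH']) ? $_SERVER['HTTP_IF_NONE_MATCH'] : null;
    $ims = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? strtotime($_SERVER['HTTP_IF_MODIFIED_SINCE']) : false;

    if ($inm === $etag || ($ims !== false && $ims >= $lastModified)) {
        header('HTTP/1.1 304 Not Modified');
        exit;
    }

    echo $content;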


I bet Steve excluded files that didn't have these 3 headers.


To celebrate this article, I just set my cache size in Firefox to zero. Why? Aside from being a bit contrary by nature, I present the following...

Fact: My netbook is memory constrained and the network is often faster than my local disk.

Hunch: Caching content locally does nothing but thrash my disk, trash my FS cache and slow my entire computer down.

We'll see how it pans out. :-)
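
For the record, here is roughly what that amounts to as user.js prefs (Firefox's pref names; exact knobs may vary by version):

    // user.js - turn off the on-disk cache entirely
    user_pref("browser.cache.disk.enable", false);
    user_pref("browser.cache.disk.capacity", 0);
    // leave the in-memory cache alone
    user_pref("browser.cache.memory.enable", true);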


Cached HTTP resources are usually small, in the tens to hundreds of kilobytes. Do you mean your netbook is so memory constrained that it can't afford a few megabytes of cache memory?


I am talking about on-disk cache, not in-memory cache.

Being memory constrained means I do not have a large filesystem cache and cannot afford to have it smashed by lots of cache reads/writes.

It also means the odds that a cached (on-disk) resource is actually in RAM are low, so a cache hit involves multiple disk seeks, which very quickly add up to more latency than fetching over the network - especially when you consider that I have one disk, which cannot run multiple seeks in parallel, whereas network round trips can overlap.

Performance on modern computers very often boils down to avoiding disk seeks.


That's why I said HTTP resources are typically small. They won't take up too much cache memory when brought in from disk. Filesystem caching is so good these days that if a file is used more than once, its disk seek time is negligible when amortized over its many reads.

I'm pretty sure jquery.xx.min.js will be downloaded over and over again for every page if you don't cache it.


Seems I've been downvoted for an unhelpful answer. :-P

The problem isn't the size of each individual resource, it is the total volume of resources. In fact, the small size of web resources makes the problem worse, not better: a local disk beats the network when reading large volumes of sequential data (say, a movie). Small assets, on the other hand, which you correctly assert are the majority, result in lots of serialized disk seeks, whereas the network, which can initiate many transfers in parallel, can do much better.
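
A back-of-envelope comparison, with every number an assumption rather than a measurement:

    <?php
    // Rough comparison: serialized disk seeks vs. overlapping network fetches.
    $resources   = 20;  // small assets on a typical page (assumed)
    $seekMs      = 10;  // one seek on a laptop spinning disk (assumed)
    $seeksPerHit = 2;   // metadata + data per cache hit (assumed)
    $rttMs       = 50;  // one network round trip (assumed)
    $parallel    = 6;   // concurrent connections (assumed)

    $diskMs    = $resources * $seeksPerHit * $seekMs;    // seeks serialize on one spindle
    $networkMs = ceil($resources / $parallel) * $rttMs;  // fetches overlap

    printf("disk: %d ms, network: %d ms\n", $diskMs, $networkMs);
    // disk: 400 ms, network: 200 ms under these assumptions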

Regarding the effect of the FS cache, I provide one unscientific data point: My Firefox cache had grown to 650MB in size - on a machine with 1GB of RAM. It is pretty obvious that the vast majority of those resources will not fit in the FS cache and will require some disk seeks to load and display. Reducing the size of the cache to something that more-or-less fits in RAM might make sense... but why bother with an on-disk cache at all then?

Constantly writing to the disk as I browse the web is also a major source of system load (cache churn, seeks, write traffic), and this effect scales not with the size of the cache, but with the activity of my browser. Turning the cache off eliminated it completely, which may (or may not) offset any benefits provided by the occasional cache hit.

Finally, my testing implies that your worries about jquery and other shared elements within a website are needless, the browser also has a RAM cache and all the active things on a page live there. Clicking from page to page within a site does not trigger a reload of shared elements, even if the disk cache has been switched off.

I am still trying to come up with a way to accurately measure and benchmark the effects of local disk cache on the browsing experience. Once I have, I will run some tests and post the results.

Note that all of this is relevant to the OP, because the "call to action" of the OP was that browsers should increase the size of their disk caches. I suspect this would be, in many cases, a bad idea. I went and did the opposite, and so far my laptop - and the web - both feel snappier.


You are very sure of many things, which I have already tested and proven to be incorrect on my machine. So far, switching off the cache seems to be an improvement for me.

Your situation may be different, but I suggest you measure and test if the topic interests you. :-)


> the network is often faster than my local disk

I'm wondering what kind of network you have...


I have one of those fancy networks that can open multiple TCP/IP connections in parallel, whereas my Firefox disk cache lives on one of those crappy old spinning disks which can only do one seek at a time. :-)


YSlow is a great tool for finding out whether caching is turned on for a resource.
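
If you just want to spot-check a single resource without a plugin, PHP's get_headers can do it (the URL is illustrative; header-name case follows whatever the server sent):

    <?php
    // Print the caching-related response headers for one resource.
    $headers = get_headers('http://example.com/js/jquery.min.js', 1);
    foreach (array('Cache-Control', 'Expires', 'ETag', 'Last-Modified') as $name) {
        $value = isset($headers[$name]) ? $headers[$name] : '(not set)';
        if (is_array($value)) {
            $value = implode(', ', $value);  // duplicate headers come back as arrays
        }
        echo "$name: $value\n";
    }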



