Why not just use LRU?

zaarn · on June 18, 2018

CLOCK and CAR can perform a bit better than LRU in certain situations.

Notably, CLOCK keeps items that are accessed at least once during a round, while LRU will kick out the least recently accessed item.

The benefit of using CLOCK is that you don't have to maintain a list, only a ring buffer. Removing an item from a CLOCK buffer can be almost free if you use a single bit to indicate presence. An LRU has to maintain some form of ordered list, whether array-backed or linked. In practice, LRU is expensive to implement while CLOCK is simple. CAR offers LRU performance with less complexity.
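A minimal sketch of the ring-buffer idea, assuming the textbook second-chance variant of CLOCK. The class and field names are made up for illustration, keys are assumed never to be None, and new entries start with their reference bit cleared (some variants set it on insert):

```python
class ClockCache:
    """Minimal CLOCK (second-chance) cache over a fixed-size ring buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = [None] * capacity    # ring buffer of keys (None = empty slot)
        self.values = [None] * capacity
        self.ref = [False] * capacity    # one reference bit per slot
        self.index = {}                  # key -> slot, for O(1) lookup
        self.hand = 0                    # position of the clock hand

    def get(self, key):
        slot = self.index.get(key)
        if slot is None:
            return None
        self.ref[slot] = True            # a hit only sets a bit; nothing is reordered
        return self.values[slot]

    def put(self, key, value):
        slot = self.index.get(key)
        if slot is not None:             # update in place
            self.values[slot] = value
            self.ref[slot] = True
            return
        # Sweep: clear reference bits until an empty or unreferenced slot is found.
        # This terminates because each pass over a slot clears its bit.
        while self.keys[self.hand] is not None and self.ref[self.hand]:
            self.ref[self.hand] = False
            self.hand = (self.hand + 1) % self.capacity
        victim = self.keys[self.hand]
        if victim is not None:
            del self.index[victim]
        self.keys[self.hand] = key
        self.values[self.hand] = value
        self.ref[self.hand] = False
        self.index[key] = self.hand
        self.hand = (self.hand + 1) % self.capacity
```

With a capacity of 3, inserting a, b, c, reading a, and then inserting d evicts b: the hand finds a's bit set, clears it and moves on, and b is the first slot that was not touched during the round.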

Symmetry · on June 15, 2018

Performance. The Wikipedia page on cache policies is pretty good.

https://en.wikipedia.org/wiki/Cache_replacement_policies#Clo...

eternalban · on June 15, 2018

Wikipedia says "Substantially" better than LRU, but the actual results seem to show performance converging to the same level as the cache gets larger. (See page 5.)

https://dbs.uni-leipzig.de/file/ARC.pdf

[p.s. there is also the matter of the (patterns in the) various trace runs. Does anyone know where these traces can be obtained?]

ebikelaw · on June 15, 2018

All caches have equal hit rates in the limit when the size of the cache approaches infinity. For finite caches, ARC often wins. In practical experience I've found that a weighted ARC dramatically outperformed LRU for DNS RR caching, in terms of both hit rate and raw CPU time spent per access. This is because it was easy to code an ARC cache that had lock-free access to frequently referenced items; once an item had been promoted to T2 no locks were needed for most accesses. With LRU it's necessary to have exclusive access to the entire cache in order to evict something and add something else.

Of course there are more schemes than just LRU and ARC, and one can try to employ lock-free schemes more than I'm willing to do. This is just my experience.
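The point above about hit rates converging in the limit can be sketched with a toy LRU simulation. The trace here is synthetic (a repeated scan over four keys, made up for illustration, not one of the traces from the paper), and `lru_hit_rate` is a hypothetical helper:

```python
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Replay a sequence of keys through an LRU cache and return the hit rate."""
    cache = OrderedDict()                # oldest entry first
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)       # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used entry
            cache[key] = True
    return hits / len(trace)


# Synthetic trace: scan keys 0..3 five times (20 accesses, 4 distinct keys).
trace = [0, 1, 2, 3] * 5
rates = {size: lru_hit_rate(trace, size) for size in (3, 4, 100)}
```

On this trace LRU with 3 slots never hits, because each key is evicted just before it is needed again. With 4 or more slots only the four first-touch misses remain, giving 16/20 = 0.8, and no policy can beat that once the cache holds every distinct key. The interesting differences between policies show up only below that size.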

NovaX · on June 16, 2018

ARC often wins against LRU, but there is a lot left on the table compared to other policies. That's because it does capture some frequency, but not very well imho.

You can mitigate the exclusive lock using a write-ahead log approach [1] [2]. You record events into ring buffers, replay them in batches, and guard the replay with an exclusive tryLock. This works really well in practice and lets you do much more complex policy work with much less worry about concurrency.

[1] http://highscalability.com/blog/2016/1/25/design-of-a-modern...

[2] http://web.cse.ohio-state.edu/hpcs/WWW/HTML/publications/pap...
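A rough sketch of the record, replay and tryLock idea described above, applied to plain LRU. This is not the code from either link: `BufferedLRU` and its parameters are hypothetical, a single lossy `deque` stands in for the ring buffers, and the lock-free read relies on CPython's dict reads being safe alongside a concurrent writer. A stored value of None is treated as a miss.

```python
import threading
from collections import OrderedDict, deque


class BufferedLRU:
    """LRU cache whose reads log events instead of reordering under a lock."""

    def __init__(self, capacity, buffer_size=64, drain_threshold=16):
        self.capacity = capacity
        self.drain_threshold = drain_threshold
        self.values = {}                            # key -> value, read without the lock
        self.order = OrderedDict()                  # LRU order, guarded by policy_lock
        self.read_log = deque(maxlen=buffer_size)   # bounded; drops oldest events when full
        self.policy_lock = threading.Lock()

    def get(self, key):
        value = self.values.get(key)                # no policy lock on the read path
        if value is not None:
            self.read_log.append(key)               # record the access, reorder later
            if len(self.read_log) >= self.drain_threshold:
                self._try_drain()
        return value

    def put(self, key, value):
        with self.policy_lock:                      # writes replay pending reads first
            self._drain_locked()
            self.values[key] = value
            self.order[key] = None
            self.order.move_to_end(key)
            while len(self.order) > self.capacity:
                victim, _ = self.order.popitem(last=False)
                del self.values[victim]

    def _try_drain(self):
        # tryLock: if another thread is already draining, skip rather than wait.
        if self.policy_lock.acquire(blocking=False):
            try:
                self._drain_locked()
            finally:
                self.policy_lock.release()

    def _drain_locked(self):
        # Replay buffered reads as one batch; the caller must hold policy_lock.
        while True:
            try:
                key = self.read_log.popleft()
            except IndexError:
                break
            if key in self.order:                   # entry may have been evicted since
                self.order.move_to_end(key)
```

Losing a few read events when the buffer overflows only makes the recency order slightly stale, which is the trade that keeps readers from ever blocking on the policy lock.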

eternalban · on June 16, 2018

I don't believe the table in question approached "infinity". Check again.

NovaX · on June 16, 2018

I wrote a simulator and link to the traces. One unfortunate aspect is that the ARC authors did not provide a real comparison with LIRS, except in a table that includes an incorrect number. It comes off a little biased since LIRS outperforms ARC in most of their own traces.

https://github.com/ben-manes/caffeine/wiki/Simulator

eternalban · on June 16, 2018

Thank you, you are awesome!