LRU Cache Implementation

One of the problems every developer will eventually have to tackle is latency; specifically, being able to retrieve and process data in a timely fashion. This issue can come in many guises, but it generally manifests as needing to read data from a backing store that cannot deliver the performance the application requires. This can be a tricky problem to solve, but the general approach is to implement some form of caching. The remainder of this article discusses one such caching mechanism, called the LRU Cache.

Firstly, let’s define what we mean by “cache”. Put simply, a cache is a bounded storage mechanism that allows low latency access to data. In other words, it’s a temporary storage area that can hold a subset of the data you’re working with, to provide faster access to that data than you’d otherwise expect if retrieving it directly from the backing store. When I use the term “backing store” I am referring to any mechanism that is able to store data for retrieval, such as a local disk, remote database, web service or, in fact, anything that is capable of serving data to your application.

So, what is an LRU Cache? It turns out that most of us have used an LRU without even realizing it. If you’ve ever used Microsoft Word, or, in fact, any application that lets you open your recent documents, you have made use of an LRU. LRU stands for Least Recently Used and it refers to the eviction strategy employed by the cache. As noted before, a cache is bounded. This means that it has an upper limit on the amount of data it can store. Once this upper limit is reached it is necessary to evict some data. With an LRU the least recently used data is purged. In other words, if it’s not been used for a long time the chances are we don’t need it anymore.

There are plenty of strategies for eviction, but the LRU is probably the most common as it provides a good all-round strategy for most use-cases and requires no specialized knowledge of the problem domain. In other words, it uses a simple, naive but generally quite sensible strategy that is based upon the usage of the data rather than the data itself. Other, more complex caches require specialized knowledge either about the data or the way the data will be used.

One of the most common uses of an LRU is as a Page Replacement Algorithm (PRA). All modern Operating Systems make use of Virtual Memory. This allows the OS to run far more applications than would otherwise fit into memory. It works by using the disk as a temporary memory store, writing memory “pages” to disk when they are not needed. If the physical memory is becoming full the PRA will look for pages that can be paged out to disk. This is often done by purging those pages that haven’t been needed for a while. It is reasonable to assume that if they’ve not been used for a while they are likely to be less important than a recently used page.

Whilst this strategy isn’t perfect, and there are times when it can backfire (who hasn’t had a computer start “thrashing”?), for the most part it is a reasonable strategy that will fit well with most latency problems. In general, an LRU is a good place to start, and you should only implement more complex strategies if performance profiling with “representative” data suggests otherwise.

OK, so how complicated is it to implement one of these LRU Caches? As it turns out, it’s actually not that difficult at all. You can think of an LRU Cache as a bounded key/value store. You need a key because you need to be able to find your data once it’s in the cache, and the value is, of course, the data you’re caching. We can implement a simple LRU Cache using nothing more than std::map (or std::unordered_map) and std::list.

The list is used as a queue and represents the LRU. The order of the queue determines the priority and order for eviction. The size of the list is bounded, with recent items being pushed into one end and the least recent items popping out the other. Let’s assume we decide to bound the list to 10 items. We can push 10 items into it no problem. What happens when we push item number 11? Simple, the first item we push in (item 1) is popped off the end of the queue to make space for item 11.
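That bounded push-and-pop behaviour can be sketched with nothing more than std::list (the function name here is illustrative, not part of any library):

```cpp
#include <cstddef>
#include <list>

// Push n items into a queue bounded at `capacity`. Recent items go in
// at the front; once the bound is exceeded, the oldest item (at the
// back) is popped to make room.
std::list<int> fill_bounded(std::size_t capacity, int n) {
    std::list<int> queue;
    for (int item = 1; item <= n; ++item) {
        queue.push_front(item);      // most recent at the front
        if (queue.size() > capacity)
            queue.pop_back();        // evict the least recent
    }
    return queue;
}
```

With a capacity of 10, pushing item 11 pops item 1 off the back, exactly as described above.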

Simple, eh? Not quite. You see, we have one special case we need to take care of. What if item 11 is actually item 5? In other words, we’re using something again that we used previously and that is still in our list. Well, that’s also quite simple: we just remove the original item 5 from the list and re-queue it. Ah, but there’s now another problem. How do we find the item in the queue? Sure, we could do a linear search for it, and in a list of only 10 items that’s probably going to work out just fine. But what if the list has 5 million items in it? A linear search is not going to work too well; it’ll be way too slow.

The answer is to use an index, which tracks the items in the list so we can find them quickly. We do this by using a map (or unordered_map) that has the item key as its key and an iterator to the item in the list as its value. In other words, we use our index to map the key to the iterator that represents it in the list. std::list has a nice property that neither appending nor deleting items will invalidate iterators (other than the iterator of the deleted item). This means that if we see an item again we can use the map to find it in the list, erase the existing entry and re-queue it.
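As a sketch of that bookkeeping (the struct and member names here are illustrative), the index maps each key to its list iterator, so a repeated item can be found and re-queued without a linear search:

```cpp
#include <list>
#include <string>
#include <unordered_map>

// A queue of keys plus an index mapping each key to its position in
// the queue, so an item can be found and re-queued in O(1).
struct recency_queue {
    std::list<std::string> queue;   // most recent at the front
    std::unordered_map<std::string,
                       std::list<std::string>::iterator> index;

    void touch(const std::string& key) {
        auto it = index.find(key);
        if (it != index.end()) {
            // Seen before: erase the old entry. Only its own iterator
            // is invalidated; the rest of the index stays valid.
            queue.erase(it->second);
        }
        queue.push_front(key);      // re-queue as most recent
        index[key] = queue.begin();
    }
};
```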

So, that’s the theory now let’s look at the code!
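The original listing isn’t reproduced here, so what follows is a minimal sketch of the structure just described: a std::list of key/value pairs as the usage queue and a std::unordered_map as the index. The class and member names (`lru_cache`, `get`, `put`) are illustrative and not taken from the Amp library. One small refinement over the erase-and-re-insert step described above: std::list::splice moves a node to the front in place, so the iterator stored in the index stays valid and doesn’t need updating.

```cpp
#include <cstddef>
#include <list>
#include <optional>
#include <unordered_map>
#include <utility>

// A minimal LRU cache: a bounded key/value store where the least
// recently used entry is evicted once capacity is reached.
template <typename Key, typename Value>
class lru_cache {
public:
    explicit lru_cache(std::size_t capacity) : capacity_(capacity) {}

    // Look a value up; a hit also re-queues the entry as most recent.
    std::optional<Value> get(const Key& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        // splice moves the node to the front of the queue without
        // invalidating the iterator stored in the index.
        queue_.splice(queue_.begin(), queue_, it->second);
        return it->second->second;
    }

    // Insert or update a value, evicting the least recent entry if full.
    void put(const Key& key, Value value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = std::move(value);
            queue_.splice(queue_.begin(), queue_, it->second);
            return;
        }
        if (queue_.size() == capacity_) {
            index_.erase(queue_.back().first);  // evict least recent
            queue_.pop_back();
        }
        queue_.emplace_front(key, std::move(value));
        index_[key] = queue_.begin();
    }

private:
    std::size_t capacity_;
    std::list<std::pair<Key, Value>> queue_;  // most recent at front
    std::unordered_map<
        Key, typename std::list<std::pair<Key, Value>>::iterator> index_;
};
```

Both get and put run in O(1) average time: the hash map lookup finds the node, and splice, push and pop are constant-time list operations.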

As you can see, it’s a pretty simple data structure to implement, with the code not really doing much more than marshalling between the list and the index to ensure the eviction policy is implemented correctly.

The complete code for this, and other useful classes, can be found in my Amp library, which is hosted on GitHub.