In computer science, a cache is an easily accessed memory used to store a subset of a larger pool of data that is expensive or slow to fetch or compute. Repeated accesses to the data can be served from the cache rather than by refetching or recomputing, so that the average access cost or time is lower.

Caches work because many access patterns in typical computer applications exhibit locality of reference. There are several kinds of locality; the two most important are temporal locality, where the same data is accessed repeatedly within a short span of time, and spatial locality, where data stored near each other tend to be accessed close together in time.
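
A minimal sketch of the idea in Python may help; here slow_square is a stand-in for any computation or fetch that is expensive to repeat:

    import time

    _cache = {}

    def slow_square(n):
        """Stands in for any computation that is expensive to repeat."""
        time.sleep(0.1)            # simulate a slow fetch or computation
        return n * n

    def cached_square(n):
        if n in _cache:            # cache hit: cheap dictionary lookup
            return _cache[n]
        result = slow_square(n)    # cache miss: pay the full cost once
        _cache[n] = result
        return result

The first call to cached_square(7) pays the full cost; a repeated call returns almost instantly, which is exactly the benefit temporal locality provides.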

Cache hierarchy in a modern computer

Typical computers have a hierarchy of caches. The hard disk, as well as storing local files, may be used to cache copies of files from tape drives, remote file servers, or (most commonly) web pages. Some of the computer's main memory is generally used to cache portions of the hard drive. Portions of main memory are in turn cached in a high-speed CPU memory cache, which itself may have one, two, or even three levels. Finally, the CPU keeps some of the data it operates on in a register file, which some people view as a software-controlled cache.

Each of these levels of cache is smaller, faster, and more expensive per byte than the ones below it. The large number of caches in the hierarchy reflects the enormous range of speeds and capacities of modern computers. A web page access may take ten seconds to load one of billions of possible pages. A disk drive access may take ten milliseconds to load a few kilobytes from the hundreds of gigabytes on the disk platters. Main memories, which are commonly made of DRAM, take about 100 nanoseconds to load tens of bytes from a few hundred megabytes. Large CPU caches may take 10 nanoseconds to load ten bytes from a cache of perhaps a megabyte; small CPU caches may take 1 nanosecond to load the same ten bytes from a cache of tens of kilobytes. Finally, registers are accessed in fractions of a nanosecond, from the tens of registers in a typical register file.
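
The benefit of each level can be quantified as an average access time. As a rough worked example using the figures above (10 nanoseconds for a large CPU cache, 100 nanoseconds for DRAM) and an assumed 95% hit rate:

    hit_time = 10       # ns, large CPU cache (figure from above)
    miss_penalty = 100  # ns, main-memory (DRAM) access
    hit_rate = 0.95     # assumed for illustration

    # Every access pays the cache lookup; misses additionally pay
    # the trip to main memory.
    average = hit_time + (1 - hit_rate) * miss_penalty
    print(average)      # 15.0 ns -- far closer to the cache than to DRAM

Even a modest hit rate thus pulls the average access time close to that of the fast level, which is why deep hierarchies pay off.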

Other caches

The CPU caches are generally managed entirely by hardware. Other caches are managed by a variety of software. The cache of disk sectors in main memory is usually managed by the operating system kernel. The BIND DNS daemon caches a mapping of domain names to IP addresses, as does a resolver library.
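
A minimal sketch of the idea behind a resolver's cache, assuming a hypothetical lookup_upstream function that performs the real (slow) query, with each entry expiring after a time-to-live:

    import time

    _dns_cache = {}   # hostname -> (address, expiry_time)

    def resolve(hostname, lookup_upstream, ttl=300):
        """Return a cached address if still valid, else query upstream.

        lookup_upstream is a hypothetical function standing in for a
        real DNS query; ttl is the cache lifetime in seconds.
        """
        now = time.time()
        entry = _dns_cache.get(hostname)
        if entry is not None and entry[1] > now:
            return entry[0]                      # hit: not yet expired
        address = lookup_upstream(hostname)      # miss or stale entry
        _dns_cache[hostname] = (address, now + ttl)
        return address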

A cache of recently visited web pages can be managed by a web browser. Some browsers are configured to use an external proxy web cache, a server program through which all web requests are routed so that it can cache frequently accessed pages for everyone in an organization. Many ISPs use proxy caches to save bandwidth on frequently accessed web pages.

The search engine Google keeps a cached copy of each page it examines on the web. These copies are used by the Google indexing software, but they are also made available to Google users, in case the original page is unavailable. If you click on the "Cached" link in a Google search result, you will see the web page as it looked when Google indexed it.

The CPU register file, if thought of as a cache, is managed primarily by the compiler, and to some extent by the operating system kernel.

Cache management

There are basically three problems that must be addressed while a cache is in use: choosing which data to evict when the cache fills (the replacement policy), keeping cached copies consistent with the underlying data (coherency), and deciding when to propagate updates back to long-term storage (the write policy). If the strategies for addressing these problems are simple enough, and speed is crucial enough, these policies can be implemented in hardware; otherwise they are implemented in software.

The cache is inevitably too small to hold all the data referenced, so some data must be evicted when new data is brought in. The cache performs best if the data evicted is that which is least likely to be referenced again. Heuristics that attempt to predict the future access pattern are known as the replacement policy. One popular replacement policy, least recently used (LRU), evicts the datum that has gone unreferenced for the longest time to make room for the incoming datum.
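
A minimal LRU cache sketch in Python (illustrative only, not tied to any particular system; an ordered dictionary tracks recency of use):

    from collections import OrderedDict

    class LRUCache:
        """Fixed-capacity cache that evicts the least recently used entry."""

        def __init__(self, capacity):
            self.capacity = capacity
            self._data = OrderedDict()       # key order tracks recency of use

        def get(self, key):
            if key not in self._data:
                return None                  # cache miss
            self._data.move_to_end(key)      # mark as most recently used
            return self._data[key]

        def put(self, key, value):
            if key in self._data:
                self._data.move_to_end(key)
            self._data[key] = value
            if len(self._data) > self.capacity:
                self._data.popitem(last=False)   # evict least recently used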

The data set being cached may be changed by other entities, in which case the copy in the cache may become out of date, or stale. Conversely, the computer may update the data in its own cache, causing cached copies of the data in other computers (or other caches within the same computer) to become stale. Communication protocols between the cache managers that keep the data set consistent are known as the coherency protocol.
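
Full coherency protocols are involved, but one simple way to detect staleness is to revalidate each cached entry against the source before using it. A sketch, assuming a hypothetical source object offering a cheap version(key) check and a slow fetch(key):

    class RevalidatingCache:
        """Detects stale entries by comparing version stamps on each read."""

        def __init__(self, source):
            self.source = source     # hypothetical authoritative store
            self._data = {}          # key -> (version, value)

        def get(self, key):
            current = self.source.version(key)   # cheap freshness check
            if key in self._data:
                version, value = self._data[key]
                if version == current:
                    return value                 # cached copy still fresh
            value = self.source.fetch(key)       # stale or missing: refetch
            self._data[key] = (current, value)
            return value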

If the computer updates the data in the cache, that update must eventually be propagated to the long-term storage point of the data, and eventually to any other users of that data. The update propagation timing is set by the write policy. In a write-through cache, every update to the cache is immediately written to the backing store as well. In a write-back cache, updates are held in the cache and marked dirty; dirty data is written to the backing store only when it is evicted or explicitly flushed.
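
A minimal sketch of the two policies (illustrative; a plain dict stands in for the slow backing store):

    class WriteThroughCache:
        """Every update is propagated to the backing store immediately."""

        def __init__(self, backing_store):
            self.backing_store = backing_store
            self._data = {}

        def write(self, key, value):
            self._data[key] = value
            self.backing_store[key] = value    # propagate at once

    class WriteBackCache:
        """Updates stay in the cache until flushed or evicted."""

        def __init__(self, backing_store):
            self.backing_store = backing_store
            self._data = {}
            self._dirty = set()                # keys modified since last write-back

        def write(self, key, value):
            self._data[key] = value
            self._dirty.add(key)               # defer the store write

        def flush(self):
            for key in self._dirty:
                self.backing_store[key] = self._data[key]
            self._dirty.clear()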

There are intermediate policies as well. For example, the cache may be write-through, but the writes may be held temporarily in a store data queue, usually so that multiple stores can be processed together (which can reduce bus turnarounds and so improve bus utilization).

Caching for read access only is more common than caching for writes when operating over networks, because the coherency protocol can become exceedingly complicated when communication is unreliable. For instance, web page caches and client-side network file system caches are typically read-only specifically to keep the network protocol simple and reliable.
