Fundamentals of Caching

Caching Basics: Why? When? What? Where? Writing, Replacement, and Invalidation Strategies

Caching is essential for the scalability and performance of data-intensive applications. But it can be a nightmare if it is not well designed or well implemented.

Let me share three stories with you:

Story #1 Some time ago, I helped a company that was facing performance issues in a .NET web application. For no apparent reason, the application was consuming large amounts of RAM and CPU time. After some analysis, we discovered that the problem was a poorly designed “in-memory caching strategy” that was throwing away thousands of small objects at 5-minute intervals.

Story #2 Another company was facing extremely high network usage. After a short analysis, we concluded that the cached objects were far larger than ideal and frequently used, and that the application relied on a naïve server-side distributed caching strategy. As a result, network bandwidth was the bottleneck.

Story #3 Another company was facing problems with SQL Server. The reason: they were running very similar, expensive queries with aggregations, over and over again, to build the “last quarter sales summary” report.

What all these stories have in common is a bad or nonexistent caching strategy.

In this series, I will share with you strategies for good caching design and implementation. But let’s start with the basics.

Cache, Huh?

A cache is an intermediary data store (hardware or software) that can serve its data faster than the original data source (database, web service, etc.).

Why should you care about caching?

Caching is essential because it can improve application performance significantly, saving your users time and your company money.

Admittedly, it introduces a new category of architectural and implementation complexity (where complexity means cost). But if your application relies heavily on the network, disk, and other slow resources, a suitable caching strategy can save the day.

When is caching important?

Considering the requests that your application needs to handle, answer the following questions:

  • Does the application need to access resources that are hosted externally (in another hosting context)?
  • Does the application need to access resources that do not change frequently?
  • Does the application need to access the same resources over and over again?
  • Does the application produce the same output frequently?
  • Does the application need to run extensive aggregation calculations?

If you answered yes to one or more of the questions above, you should start thinking about caching. And if you answered yes to all of them, you should begin implementing some kind of caching strategy right now.

What should be cached?

The results of the following processes are good candidates for caching:

  1. Long-running database queries
  2. High-latency network requests (for example, to external APIs)
  3. Computation-intensive processing

Where should the cache run?

For client-server applications, a cache can be:

  • client-side, implemented in the client (for example, the browser)
  • server-side, implemented in the server
  • somewhere in the middle (CDN)

In distributed scenarios, the cache could be:

  • local, on the same machine (node) where the application is running
  • remote, on another machine, accessible over the network.

Finally, whenever the cache is implemented locally, it could be:

  • in-process, running in the same operating system process as the consuming application (in-memory)
  • out-of-process, running in another operating system process.

There are pros and cons for each of these types. The right option depends on the context.

Cache writing strategies

There are two common strategies for writing data to a cache:

  1. Pre-caching, where data (usually small pieces of it) is loaded into the cache ahead of time, typically during application initialization, before any request is served.
  2. On-demand, where the application first checks whether the requested data is in the cache. If the data is found (a cache hit), the application uses it and avoids the slower source. If the data has not been written to the cache yet (a cache miss), the application retrieves it from the slower source and then writes the result to the cache, saving time on subsequent requests for the same data.
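
As a rough illustration of both writing strategies, here is a minimal Python sketch. Everything in it is an assumption made for the example: load_from_source is a hypothetical stand-in for the slow source (a database or web service), the cache is a plain in-process dictionary, and the key names are invented.

```python
import time

# Hypothetical slow source standing in for a database or web service.
def load_from_source(key):
    time.sleep(0.01)  # simulate latency
    return "value-for-" + key

cache = {}                         # in-process cache: key -> value
stats = {"hits": 0, "misses": 0}   # simple counters for illustration

def warm_up(keys):
    """Pre-caching: load a small, known set of keys before serving requests."""
    for key in keys:
        cache[key] = load_from_source(key)

def get(key):
    """On-demand caching: check the cache first, fall back to the slow source."""
    if key in cache:
        stats["hits"] += 1
        return cache[key]
    stats["misses"] += 1
    value = load_from_source(key)
    cache[key] = value  # write the result so later requests are served faster
    return value

warm_up(["config"])          # pre-cached during "initialization"
get("config")                # cache hit
first = get("user:42")       # cache miss, goes to the slow source
second = get("user:42")      # cache hit
```

Note that this sketch never removes anything from the cache, so it only makes sense for data sets that are small and stable; size limits and invalidation are separate concerns.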

Cache replacement strategies

A cache frequently has a fixed, limited size. So, whenever you need to write to the cache (commonly, after a cache miss), you have to decide whether the data you retrieved from the slower source should be written to the cache at all and, if the size limit has been reached, which data should be removed to make room for it.

Unfortunately, there is no “one size fits all” solution here. Sometimes, the best strategy is to remove the least recently requested data from the cache. But there are situations where you will need to take other aspects into consideration, such as the cost of retrieving the data again from the slower source.

Any cache replacement strategy will have its drawbacks, and you will need to consider the context of your application. The best approach will be the one that results in fewer cache misses.
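
To make the "remove the least recently requested data" option concrete, here is a small Python sketch of a fixed-size cache with least-recently-used (LRU) eviction. The LRUCache class and its method names are hypothetical, chosen for this example; it is one possible replacement strategy, not a recommendation for every context.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-size cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # ordered from oldest to most recently used

    def get(self, key, default=None):
        if key not in self._items:
            return default               # cache miss
        self._items.move_to_end(key)     # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry

    def __contains__(self, key):
        return key in self._items

lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")       # "a" becomes the most recently used entry
lru.put("c", 3)    # cache is full, so "b" (least recently used) is evicted
```

A strategy that also weighs the cost of retrieving each item again would need extra bookkeeping per entry, which this sketch deliberately leaves out.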

Cache invalidation strategies

Cache invalidation is the process of determining if a piece of data in the cache should or should not be used to service subsequent requests.

There are only two hard things in Computer Science: cache invalidation and naming things. (Phil Karlton)

Because the application should rarely serve stale or invalid data to its users, we should always design some mechanism for invalidating cached data.

The most common strategies for cache invalidation are:

  • Expiration time, where the application knows how long the data will be valid. After this time, the data should be removed from the cache, causing a “cache miss” on a subsequent request;
  • Freshness verification, where the application runs a lightweight procedure to determine whether the data is still valid every time the data is retrieved. The downside of this alternative is that it adds some execution overhead to every read;
  • Active invalidation, where the application actively invalidates the data in the cache, normally when some state change is identified.
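
The first and third strategies can be sketched in a few lines of Python. The ExpiringCache class below is hypothetical and assumes a single time-to-live (TTL) for every entry; the clock is injectable only so that the example can simulate time passing. Freshness verification is not shown, because it depends on what the slower source offers (a version number or a last-modified timestamp, for example).

```python
import time

class ExpiringCache:
    """Cache with a fixed time-to-live per entry and explicit invalidation."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so tests can control time
        self._items = {}        # key -> (value, expires_at)

    def put(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]    # expired: drop it and report a miss
            return None
        return value

    def invalidate(self, key):
        """Active invalidation: call this when the underlying data changes."""
        self._items.pop(key, None)

now = [0.0]  # fake clock for the example, in seconds
reports = ExpiringCache(ttl_seconds=60, clock=lambda: now[0])

reports.put("last-quarter", "summary")
fresh = reports.get("last-quarter")      # still within the TTL
now[0] = 61.0
expired = reports.get("last-quarter")    # past the TTL, treated as a miss

reports.put("price", 10)
reports.invalidate("price")              # e.g. after the price was updated
invalidated = reports.get("price")
```

One simplification to be aware of: returning None for a miss means a cached None cannot be told apart from a missing entry; a real implementation would likely use a sentinel or raise an exception.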

Architectural implications

From all of the above, a few architectural takeaways are worth keeping in mind:

  • It is essential to know the performance and availability requirements of the application and to consider them when designing the caching strategy;
  • It is essential to know the application’s slower data sources and expensive computations to determine how caching could help meet those performance and availability requirements;
  • It is helpful to define, from the start, ways to measure cache efficiency (the ratio of cache hits to misses);
  • It is crucial to understand the usage scenario to define the best replacement and invalidation strategies;
  • When adopting a non-local caching system, it is vital to measure the impact on the network infrastructure;
  • When choosing a local, in-process, in-memory strategy, it is vital to consider the implications for the runtime environment (mainly for the garbage collector);
  • It is fundamental to learn to calculate the costs of our choices (CDN, network, memory).
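
As one small example of measuring cache efficiency, Python’s standard functools.lru_cache decorator exposes hit and miss counters through cache_info(). In the sketch below, quarterly_summary is a hypothetical stand-in for an expensive aggregation, and the request sequence is invented for illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def quarterly_summary(quarter):
    # Hypothetical stand-in for an expensive aggregation query.
    return sum(i * i for i in range(10_000))

# Simulated sequence of requests: two distinct quarters, two repeats.
for quarter in ["Q1", "Q2", "Q1", "Q1"]:
    quarterly_summary(quarter)

info = quarterly_summary.cache_info()            # hits, misses, maxsize, currsize
hit_ratio = info.hits / (info.hits + info.misses)
print(f"hits={info.hits} misses={info.misses} hit ratio={hit_ratio:.2f}")
```

Tracking this number over time, rather than once, is what shows whether a change to the replacement or invalidation strategy actually helped.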

That’s all I have to say … for now

In this post, I touched on the basics of caching. In the next posts, let’s explore the fundamentals and usage scenarios for all the aspects we started to discuss here.

Do you have a story to share about caching? Share your thoughts in the comments.

Remember: Performance is a feature. Designing good solutions and writing good code is a habit.
