Advantages and Disadvantages of Cache Warming with Varnish Cache and CDNs

What is Cache Warming?

Cache warming is a process that some engineers follow to try to improve the performance of their website.

Many websites depend on caches. A cache is a system that stores portions of the website in high-performance storage to help avoid reading from systems that have poor performance or to reduce pressure on bottlenecks in the system.

Caches exist in many places in your website setup. For example, there are caches within your CPU, built into your database, and in dedicated caching applications like Redis or Memcached.

How do I warm my Varnish Cache?

In order for Varnish Cache to store an asset, it has to receive a response marked as cacheable from the origin server to an incoming request for that asset. Every time the TTL (time-to-live) for a cached asset expires, Varnish Cache removes it. In order for an updated copy of that asset to be reintroduced to the cache, Varnish Cache needs another cacheable origin response to a request for that asset.

Practically, this means that every now and then one of your users will receive a slow response for an asset that Varnish Cache should be serving quickly, because the asset has expired and their request is going all the way to the origin and back. This one slow request refreshes the asset in cache and all subsequent users receive a quick, cached response, but our first user nevertheless has a bad experience.

Warming a Varnish Cache is a technique designed to shield users from this inconvenience by making those necessary but slow cache-refreshing requests yourself. You make a series of requests to your server for cacheable assets so that you, rather than your users, receive the slow responses needed to refresh the cache.

Manually cache warming your website in a browser, however, isn’t a very repeatable process. You won’t want to sit at your computer and click through every page on your website.
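As a minimal sketch of what a repeatable version might look like, the Python below requests a fixed list of URLs once each and records the result. The function names (fetch_status, warm) and the example URLs in the usage note are illustrative assumptions, not part of any Varnish Cache tooling.

```python
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Issue a GET and return the HTTP status; the body is read and discarded."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()  # drain the body so the whole response is transferred
        return response.status


def warm(
    urls: Iterable[str],
    fetch: Callable[[str], int] = fetch_status,
) -> Dict[str, Optional[int]]:
    """Request each URL once, in order.

    The value is the HTTP status, or None when the request failed.
    With the default fetch_status, 4xx and 5xx responses also count as
    failures, because urlopen raises HTTPError for them.
    """
    results: Dict[str, Optional[int]] = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError:  # URLError, HTTPError and socket timeouts are all OSError
            results[url] = None
    return results
```

Calling warm(["https://www.example.com/", "https://www.example.com/products"]) with your own URLs in place of these placeholders would send one request per page. Passing a different fetch function makes it easy to add headers later or to test the loop without touching the network.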

What tool can I use for cache warming?

If you look around the Internet for cache warming tools you’ll find a bunch of tools of varying quality.

When our users ask us to help them with cache warming, we generally do not recommend these tools, because there’s a good chance you already have a great tool at your disposal: Wget.

Wget has a command-line option, --recursive (or -r), that tells it to be “recursive”. This means that you can point Wget at a single page on your website and it will download the content. Then, it will scan the page’s HTML for any hyperlinks and fetch those pages as well, repeating the process until it has run out of links. This approach will request every asset the crawl can reach, and so cache every cacheable one, but it has some significant trade-offs that we’ll explore below.

Here’s an example that you can reference to cache warm your website with Wget: https://gist.github.com/thomasfr/7926314
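To make the recursive idea concrete, here is a rough Python sketch of the same crawl: fetch a page, collect its links, and queue any same-site links that have not been seen yet. It is not how Wget is implemented. The names (LinkParser, crawl, fetch_html) and the max_pages limit are assumptions made for illustration, and unlike a full crawler it follows only anchor links, not images, CSS or scripts.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, List
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(
    start_url: str,
    fetch_html: Callable[[str], str],
    max_pages: int = 500,
) -> List[str]:
    """Breadth-first crawl of same-site links; returns URLs in fetch order.

    fetch_html is any function that takes a URL and returns the page HTML.
    Requesting the page through the cache is what warms it.
    """
    site = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited: List[str] = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        parser = LinkParser()
        parser.feed(fetch_html(url))
        visited.append(url)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == site and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

The max_pages cap is there so that a site with a very large number of links cannot keep the crawl running indefinitely.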

What are the advantages of cache warming?

The advantage of cache warming is that you can get content into your cache before user traffic arrives, so that users do not experience slow, non-cached delivery times. It also means that your users’ requests will put less load on your backend servers, which gives your customers a better experience and protects your backend infrastructure from excessive traffic.

What are the disadvantages of cache warming?

In a cache, web traffic is not generally evenly spread across all the objects. For example, your home page will be requested much more frequently than your privacy policy page.

Generally speaking, it is unlikely that you’ll have a cache large enough to cover all the content that your website holds.

If you write a cache warming system that starts at your home page and then crawls your website by following links (like the Wget example above), then you’ll make one HTTP request for the home page, one for your logo file, and one for each of the CSS and JavaScript files. Varnish Cache will store all the cacheable responses. Then the crawler will move on to the next link it finds on the page. It will download that, Varnish Cache will store the cacheable assets, then the next link, and so on.

Because of Varnish Cache’s typical memory constraints, if you indiscriminately crawl your site as described above, it’s possible that the cache will run out of memory and begin to evict important assets stored at the beginning of the crawl, like your home page, in favor of trivial assets found at the end of the crawl, like an image from your privacy page. This severely limits the speed benefits of Varnish Cache.
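One way to avoid crawling indiscriminately is to warm only the most-requested assets that fit within a size budget. The sketch below assumes you can obtain a hit count and an approximate size for each asset (for example from access logs). The Asset type, the select_for_warming function and the budget figure are hypothetical, and the greedy selection is a simple heuristic rather than an optimal packing.

```python
from typing import Iterable, List, NamedTuple


class Asset(NamedTuple):
    url: str
    hits: int        # requests seen in a recent window (assumed to come from logs)
    size_bytes: int  # approximate size of the cached object


def select_for_warming(assets: Iterable[Asset], budget_bytes: int) -> List[str]:
    """Greedily pick the most-requested assets that fit within budget_bytes.

    Assets are considered from most to least popular; one that would push
    the running total over the budget is skipped, and smaller, less
    popular assets may still be added after it.
    """
    chosen: List[str] = []
    used = 0
    for asset in sorted(assets, key=lambda a: a.hits, reverse=True):
        if used + asset.size_bytes <= budget_bytes:
            chosen.append(asset.url)
            used += asset.size_bytes
    return chosen
```

Setting budget_bytes comfortably below the cache’s real capacity leaves room for the objects that natural traffic will add.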

Next, you need to think about how your cache is distributed. If you are trying to cache warm a CDN, which Point of Presence (PoP) is your cache warming script talking to?

If you run the cache warming script from a central location, it is expected that your cache warming agent will connect to the PoP in the CDN network that is geographically optimal for the server doing the cache warming. So when you run your cache warming script you may only be warming the cache in a single location.
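If your CDN happens to give you an address for each PoP (many do not, and anycast networks generally cannot be targeted this way), one possible approach is to send the same request to every address while keeping your site’s name in the Host header. The sketch below is built on that assumption. The PoP names and addresses shown in the usage note are placeholders, and it uses plain HTTP on port 80 for simplicity.

```python
import http.client
from typing import Dict, Iterable, List, Tuple


def build_plan(
    pop_addresses: Dict[str, str],
    paths: Iterable[str],
) -> List[Tuple[str, str, str]]:
    """Return (pop_name, address, path) for every PoP and path combination."""
    path_list = list(paths)
    return [
        (name, address, path)
        for name, address in pop_addresses.items()
        for path in path_list
    ]


def warm_via_pop(address: str, hostname: str, path: str, timeout: float = 10.0) -> int:
    """Send a plain-HTTP GET to one PoP address and return the status code.

    The connection goes to `address`, while the Host header carries the
    site's real hostname so the PoP knows which site is being requested.
    """
    conn = http.client.HTTPConnection(address, 80, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        response.read()  # drain the body before closing
        return response.status
    finally:
        conn.close()
```

For example, build_plan({"sydney": "192.0.2.10", "london": "192.0.2.20"}, ["/", "/products"]) yields four entries, each of which could be passed to warm_via_pop along with your hostname. An alternative that needs no per-PoP addresses is to run an ordinary warming script from machines located in each region you care about.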

Finally, you need to consider the configuration of the cache.

Do you have rules configured in your cache for different countries? Different browser types?

Your cache warming script would also need to be aware of the way you have configured vcl_hash. If you have different versions of your pages for desktop and mobile devices, be sure to configure your crawler to fill the cache for both device types. Similarly, if you have different versions of content in your cache for different countries (pricing and currency details, perhaps), or for different languages, be sure to consider that in your cache warming design.
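As an illustration, suppose your vcl_hash keys on device type (derived from User-Agent) and on language (derived from Accept-Language). Those two dimensions, the header values and the function name below are assumptions made for the sketch, so substitute whatever your own configuration actually varies on. The idea is simply to generate one request per combination so that every variant gets stored.

```python
from itertools import product
from typing import Dict, Iterable, List, Tuple

# Hypothetical variant dimensions; match these to your own vcl_hash logic.
USER_AGENTS: Dict[str, str] = {
    "desktop": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "mobile": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)",
}
LANGUAGES: List[str] = ["en", "fr"]


def variant_requests(
    urls: Iterable[str],
    user_agents: Dict[str, str],
    languages: Iterable[str],
) -> List[Tuple[str, Dict[str, str]]]:
    """Return one (url, headers) pair per URL, device type and language."""
    return [
        (url, {"User-Agent": agent, "Accept-Language": language})
        for url, agent, language in product(urls, user_agents.values(), languages)
    ]
```

Each pair can then be sent with urllib.request.Request(url, headers=headers). Note that the number of requests multiplies with every dimension you add, and that variants chosen by the visitor’s IP address (such as country) usually cannot be reproduced just by changing a header.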

Let’s Recap

Cache warming may benefit your users by improving site performance or preparing the origin application for initial user traffic.

Getting it right, however, isn’t straightforward — you’ll need to think about your specific circumstances:

  1. What are the most important assets to have in cache?
  2. What is the size of my cache’s memory?
  3. Are my caches distributed across a Content Delivery Network? If so, which cache am I trying to warm?
  4. Are there configurations in my cache that I need to be aware of?

One recommendation

With all these things in mind, we generally recommend not worrying about cache warming efforts.

When natural traffic is used to prepare the cache, there may be minor performance degradations when content is refreshed. However, the caching platform is generally best optimized when primed via natural cache rules. You don’t need to consider the above items, because the cache will organize itself to best suit the traffic that your users are generating.

To learn more about how to get started with Varnish Cache, including writing Varnish Configuration Language (VCL) to cache content for your application, please download the full Varnish Cache Guide. If you have specific questions about Varnish Cache and VCL, check out our community forum or contact us at info@section.io and one of our Varnish Cache experts will be happy to help you.
