Caching Stale Content

What Is Stale-While-Revalidate, or Extending Grace?

Caching makes pages load more quickly by serving a previously stored copy of a response instead of re-fetching it over the network. Having total control over your Varnish Cache configuration lets you choose the features you need to maximize web speed by caching static and dynamic content, including full HTML documents.

Some useful caching features are less well known than others. In this post, we concentrate on what other parts of the caching world call stale-while-revalidate, an HTTP cache-control extension. Varnish, by contrast, calls the same feature ‘extending grace when the backend is down’.

Two independent HTTP cache-control extensions for stale content, stale-while-revalidate and stale-if-error, were proposed in 2010 (RFC 5861) to “allow control over the use of stale responses by caches”, yet the concept remains relatively little known. It can help improve cache hit rates, particularly for items that are requested infrequently. In addition, when there is a problem with your origin server, or when it is taking a long time to return new content, this feature lets Varnish Cache keep serving the previous (i.e. stale) cached content as users request it, leading to lower latency and a better user experience.
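To connect the HTTP extension with the Varnish terminology, here is a minimal sketch that copies an origin's stale-while-revalidate value into the object's grace period. It assumes the origin sends a header such as Cache-Control: max-age=60, stale-while-revalidate=120; the backend address is a placeholder, and the regular expression is a simplification rather than a full Cache-Control parser.

```vcl
vcl 4.0;

import std;

# Placeholder origin; replace with your own backend (assumption).
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # If the origin advertises a stale-while-revalidate window,
    # use that number of seconds as the grace period.
    if (beresp.http.Cache-Control ~ "stale-while-revalidate=[0-9]+") {
        set beresp.grace = std.duration(
            regsub(beresp.http.Cache-Control,
                   "^.*stale-while-revalidate=([0-9]+).*$", "\1s"),
            0s);  # fallback if the value cannot be converted
    }
}
```

Responses without the directive are left untouched, so they keep whatever grace Varnish would otherwise apply.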

How do you Cache Stale Content with Varnish Cache?

In earlier versions of Varnish, stale-while-revalidate was only possible using an “evil backend hack”: you kept a backend that was permanently down, set it as the backend for the request, allowed grace to apply, and finally restarted the transaction. Since Varnish 4.0 was released in 2014 (we are now on Varnish 6.0), such hacks are no longer necessary.

From Varnish 4.0 onward, Varnish Cache always prefers a fresh object, but when none can be found it looks for a stale one. If it finds a stale object that is still within its grace period, it delivers that object and starts an asynchronous backend request to refresh it. Essentially, Varnish serves the request with a stale object while simultaneously refreshing it. The VCL that has been added as standard (in builtin.vcl) since Varnish 4.0 looks like this:

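sub vcl_hit {
    if (obj.ttl >= 0s) {
        # Fresh object: a normal cache hit.
        return (deliver);
    }
    if (obj.ttl + obj.grace > 0s) {
        # Stale but within grace: deliver it; a background fetch refreshes it.
        return (deliver);
    }
    # Past TTL and grace: fetch from the backend, then deliver.
    return (fetch);
}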

Each object must be made a candidate for grace by setting beresp.grace in vcl_backend_response:

sub vcl_backend_response {
    # Keep each object usable for 2 minutes after its TTL expires.
    set beresp.grace = 2m;
}
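As an illustration of how TTL and grace combine, the sketch below gives HTML a short fresh window and a longer stale window, and gives everything else longer values. The content-type test, all durations, and the backend address are assumptions chosen for the example, and a real configuration would usually respect the origin's own caching headers rather than overriding the TTL unconditionally.

```vcl
vcl 4.0;

# Placeholder origin; replace with your own backend (assumption).
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    if (beresp.http.Content-Type ~ "^text/html") {
        # Fresh for 1 minute, then served stale for up to 2 more
        # minutes while a background fetch refreshes the object.
        set beresp.ttl = 1m;
        set beresp.grace = 2m;
    } else {
        # Other content changes less often in this example.
        set beresp.ttl = 1h;
        set beresp.grace = 6h;
    }
}
```

With these values, an HTML page requested 90 seconds after it was stored is past its TTL but inside its grace period, so it is delivered from cache immediately and refreshed in the background.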

What are the benefits?

The critical benefit of this design pattern is that every user can be served from cache as long as a copy within its grace period exists, including the first user to arrive after the object expires. Normally that first user pays a penalty in page loading speed, because they wait while the new content is fetched. With this VCL, the first user gets an older (stale) version of the content, but receives it quickly while the new content is fetched in the background.

Indeed, the benefits are three-fold: (i) users receive faster pages, (ii) cached resources are kept up to date, and (iii) the update happens in the background, so user experience is not negatively impacted.

Site performance and cache hit rates can be significantly improved, particularly on sites that have a very large number of pages, such as an online bookstore, or on sites that receive a relatively low volume of traffic. Users experience the benefits at both the browser and the edge. The edge benefits from caching stale content because users are not slowed down while a response cached at the edge is updated.

To implement this solution in Varnish 6.0, you just need to set a grace period on the object in vcl_backend_response.
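For the "backend is down" case specifically, one possible sketch keeps a long grace period on every object but only uses all of it when a health probe reports the backend as sick. The probe settings, the 10-second and 6-hour windows, and the backend address are assumptions for illustration. The return (miss) action in vcl_hit matches Varnish releases from 4.1 through the 6.0 series; other releases handle vcl_hit differently, so check the VCL reference for the version you run.

```vcl
vcl 4.0;

import std;

# Placeholder origin with a health probe (all values are assumptions).
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";
        .interval = 5s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}

sub vcl_backend_response {
    # Keep objects for up to 6 hours past their TTL.
    set beresp.grace = 6h;
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        # Fresh object: a normal cache hit.
        return (deliver);
    }
    if (std.healthy(req.backend_hint)) {
        # Backend is healthy: tolerate only 10 seconds of staleness.
        if (obj.ttl + 10s > 0s) {
            return (deliver);
        }
    } else if (obj.ttl + obj.grace > 0s) {
        # Backend is sick: use the full grace period.
        return (deliver);
    }
    # Nothing usable in cache: go to the backend.
    return (miss);
}
```

The design choice here is to trade freshness for availability only when needed: while the origin is healthy, stale content is served briefly, and the long grace period acts as a safety net during an outage.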
