Scenario
A couple of weeks ago, several clients asked us to bring their sites onto our platform to handle the large traffic spikes generated by the TV show Shark Tank. Section were engaged last year to help a number of sites stay online through the Shark Tank experience. We saw first-hand that a well-tuned cache keeps a site performing fast, with 100% availability, under extreme traffic spikes, while sites without caching were reported to have struggled and even gone offline. This not only hurt their sales but also made a poor first impression on potential investors. Our challenge was to keep the sites not just operational but fast enough to provide a great user experience. This meant minimising the traffic hitting the origin by caching with Varnish Cache, as well as giving our clients ways to monitor and manage traffic.
Preparation
We examined the sites and sorted their pages into two buckets: cacheable and non-cacheable. Luckily, none of the sites had visitor-specific data inside the HTML, which allowed most pages to be cached. We then wrote a Varnish Cache configuration for each site, taking special care around the pitfalls that break caching, such as session management, forms, cart and checkout.
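The bucketing described above can be illustrated with a minimal Python sketch. It is not the Varnish Cache configuration we wrote for any client. The path prefixes and the `session_id` cookie name are hypothetical, and a real site would need its own rules. The sketch assumes that only GET and HEAD requests outside the cart and checkout areas are cacheable, and that cookies are dropped from cacheable requests so a session cookie does not force a cache bypass.

```python
from urllib.parse import urlsplit

# Hypothetical per-visitor areas; real sites will have their own list.
NON_CACHEABLE_PREFIXES = ("/cart", "/checkout")


def is_cacheable(method, url):
    """Return True if the request may be answered from a shared cache."""
    # Form submissions and other writes always go to the origin.
    if method.upper() not in ("GET", "HEAD"):
        return False
    path = urlsplit(url).path
    # Match "/cart" and "/cart/..." but not unrelated paths like "/cartoons".
    for prefix in NON_CACHEABLE_PREFIXES:
        if path == prefix or path.startswith(prefix + "/"):
            return False
    return True


def plan_request(method, url, cookies):
    """Return ("lookup", {}) for cacheable requests, else ("pass", cookies)."""
    if is_cacheable(method, url):
        # Drop cookies so a session cookie does not bypass the cache.
        return "lookup", {}
    return "pass", dict(cookies)


if __name__ == "__main__":
    session = {"session_id": "abc"}  # hypothetical cookie name
    print(plan_request("GET", "https://example.com/products/widget", session))
    print(plan_request("GET", "https://example.com/cart", session))
```

Dropping cookies on cacheable pages is only safe here because, as noted above, the HTML contained no visitor-specific data.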
We also set up features in our managed platform such as:
- Overflow management - Only allow a certain number of active users on the site; users over that number are sent to a holding area for a few minutes with a nice Overflow page to look at.
- Multi-CDN for static assets - Give users the best page load speeds by serving static assets from the closest CDN POP.
- Active user monitoring - See number of active users on the site, which pages they are viewing, which region they are viewing from, source of traffic, etc.
- Site speed monitoring - Tracks page load speeds with breakdowns for Homepage, Category, Checkout etc.
- Origin response monitoring - Logs the HTTP responses from the origin to be able to easily spot issues.
- And more
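To make the Overflow management idea from the list above concrete, here is a minimal Python sketch of an admission gate. It is an illustration under stated assumptions, not how our platform implements the feature. The `OverflowGate` class, its parameters and the idle-timeout rule for deciding who counts as "active" are all hypothetical.

```python
import time


class OverflowGate:
    """Admit at most `max_active` visitors; the rest get a holding page."""

    def __init__(self, max_active, idle_timeout=300.0, clock=time.monotonic):
        self.max_active = max_active
        self.idle_timeout = idle_timeout  # seconds before a visitor stops counting as active
        self._clock = clock
        self._last_seen = {}  # visitor id -> time of last admitted request

    def _expire_idle(self, now):
        # Forget visitors who have not made a request within the timeout.
        cutoff = now - self.idle_timeout
        for visitor in [v for v, t in self._last_seen.items() if t < cutoff]:
            del self._last_seen[visitor]

    def admit(self, visitor_id):
        """Return True to serve the site, False to serve the Overflow page."""
        now = self._clock()
        self._expire_idle(now)
        already_active = visitor_id in self._last_seen
        if already_active or len(self._last_seen) < self.max_active:
            self._last_seen[visitor_id] = now
            return True
        return False


if __name__ == "__main__":
    gate = OverflowGate(max_active=2)
    for visitor in ("a", "b", "c"):
        print(visitor, "site" if gate.admit(visitor) else "overflow page")
```

Visitors who are already active keep their place, so nobody is moved to the holding area in the middle of a session.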
On the day
Our engineers watched our dashboards, using the monitors described above, ready to help the clients if needed. As each business was featured, we saw a huge volume of traffic hit the sites: an increase of over 900% on every site compared with previous maximums. You can see in the graph below the sudden surge in HTTP responses served by our platform.
Results
All customer sites performed well throughout the event. For all sites we saw HTML hit rates of at least 70%, substantially decreasing the load on the origin. Here is one of the better-performing sites, with an HTML hit rate of 85%.
Keep in mind that a large portion of the misses are intentional, as some pages, for example Cart or Checkout, could not be cached.
Static asset offload was also a huge win. Our front-line multi-CDN cache served over 80% of the static requests, and up to 93% of the misses were served from our secondary caching layer, which means only a very small number of static requests had to be handled by the origin.
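As a rough worked example of the two-layer offload, the Python sketch below takes the quoted 80% and 93% figures at face value and computes the share of static requests that would miss both layers. The `origin_share` helper is hypothetical, and because the real figures were "over 80%" and "up to 93%", the result is an approximation rather than a measured value.

```python
def origin_share(front_hit_rate, secondary_hit_rate):
    """Fraction of requests that miss both cache layers and reach the origin."""
    front_miss = 1.0 - front_hit_rate
    secondary_miss = 1.0 - secondary_hit_rate
    return front_miss * secondary_miss


if __name__ == "__main__":
    share = origin_share(0.80, 0.93)
    # 20% of requests miss the front line, and 7% of those miss the second layer.
    print(f"{share:.1%} of static requests reach the origin")
```

At those figures, roughly 1.4% of static requests would reach the origin.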
From a user experience perspective, the sites maintained their median page load speeds throughout the event. You will see in the graph below that as throughput increased, the median and mean page load times became lower and more consistent, because the cache was well populated.
The Overflow feature was not required at any stage, thanks to the high level of offload provided by our caching layers. Our clients reported that even during the largest spikes, their origin did not exceed 50% usage for any resource, up from 20% during normal operations. This was a great result: resource usage on the origin rose by only 30 percentage points while actual traffic increased roughly tenfold.
If you’d like to know more about how Section can help your site, then sign up, join a weekly webinar, or contact our engineering team.