Prior to joining the section.io team, I worked as a consultant focusing on processes and tooling to help software teams deliver the right product on schedule. While there are many facets to achieving this, one approach always contributed significantly to success: deploy to Production (or a Production-like environment) early and repeatedly in the development cycle.
One reason this is effective is that Production environments are typically more complex than Development environments. Unless you are developing against an environment that resembles Production, incorrect assumptions may be made and complexities overlooked. Development environments are rarely configured across separate physical tiers, and their usage is typically lighter than Production's. By deploying early, the team discovers issues while they are still easy and cheap to fix, rather than late in the development cycle when assumptions may be buried deep in the code and hard to tease apart.
It Worked in the Dev Environment!
How many bugs have you seen discovered in staging and production environments where the question “why didn’t we discover this earlier” was answered with “because the dev/test environment is provisioned differently”? In my experience, in almost all cases!
With this in mind, consider the websites you have helped build where the Production environment used a Content Delivery Network (CDN) or some other caching proxy (like Varnish Cache), but the environments prior to Production did not have that same service in front.
If your caching solution is only dealing with static assets (e.g. images, stylesheets) then there may only ever be a few bugs that occur in Production. At the same time, you’re probably not getting the most out of your caching solution.
Once you start leveraging your caching solution’s ability to handle dynamic resources (e.g. page HTML, AJAX responses) you can deliver great performance improvements to your website and increase the number of concurrent users your infrastructure can handle. However, this may come at a price. Caching rules can now break “simple” assumptions made in development and introduce issues that will only be found in a more “complex” Production environment.
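To make the static-versus-dynamic distinction concrete, here is a minimal Python sketch of a caching rule. The content types, the TTL values, and the `cache_ttl` function are all hypothetical and chosen only for illustration; a real CDN or Varnish Cache configuration would express this in its own rule language.

```python
# Hypothetical content-type groups; a real site would define its own.
STATIC_TYPES = {"image/png", "image/jpeg", "text/css", "application/javascript"}
DYNAMIC_TYPES = {"text/html", "application/json"}


def cache_ttl(content_type, cache_dynamic):
    """Return how long (in seconds) a response may be cached; 0 means never.

    The TTL values are illustrative assumptions, not recommendations.
    """
    if content_type in STATIC_TYPES:
        return 86400  # static assets: cache for a day
    if cache_dynamic and content_type in DYNAMIC_TYPES:
        return 60  # dynamic resources: cache briefly
    return 0  # everything else always goes to the origin


print(cache_ttl("text/css", cache_dynamic=False))
print(cache_ttl("text/html", cache_dynamic=False))
print(cache_ttl("text/html", cache_dynamic=True))
```

With `cache_dynamic` switched off, every HTML request still reaches the origin, so the site behaves exactly as it did in development. Switching it on is where the performance gain, and the risk, comes from.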
One example I’ve seen recently was a product search results page on an e-commerce site. It was cached so that a user who searched for the same keyword(s) that another user had already searched for would be served a cached response, avoiding the expensive queries and computations on the origin web server. However, the site had been developed assuming the page would always be loaded from the origin, and so the origin would update the user’s session data to note their most recent search query. Unfortunately, with caching rules that existed only in Production, the origin never saw the second user’s search request and did not update that user’s session data. Ultimately this led to the follow-on pages not functioning as designed, because the session data was not what they expected.
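The failure mode is easy to reproduce in miniature. The following Python sketch is a simplified simulation, not the actual site's code: the `Origin` and `CachingProxy` classes, the user names, and the query are all made up for illustration.

```python
class Origin:
    """Stand-in for the origin web server."""

    def __init__(self):
        self.sessions = {}  # user id -> session data
        self.hits = 0  # number of requests that reached the origin

    def search(self, user, query):
        self.hits += 1
        # Side effect the follow-on pages rely on:
        # remember each user's most recent search.
        self.sessions.setdefault(user, {})["last_search"] = query
        return f"results for {query!r}"


class CachingProxy:
    """Stand-in for a CDN or caching proxy that caches by search query only."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def search(self, user, query):
        if query not in self.cache:
            self.cache[query] = self.origin.search(user, query)
        return self.cache[query]


origin = Origin()
proxy = CachingProxy(origin)

proxy.search("alice", "red shoes")  # cache miss: the origin handles it
proxy.search("bob", "red shoes")  # cache hit: the origin never sees it

print(origin.hits)
print(origin.sessions.get("alice"))
print(origin.sessions.get("bob"))
```

Only one request reaches the origin, so only the first user gets a `last_search` entry. Calling `origin.search` directly, as a development environment without the proxy effectively does, would never reveal the problem.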
When Problems Occur… Just TURN IT OFF!
In scenarios like this, where the user experience becomes broken, the fastest resolution is typically to disable the Production-specific behaviour that is triggering the problem. This is how powerful caching solutions atrophy into simple static asset caches.
There are many other scenarios where a CDN can introduce these subtle behaviour changes. An effective cache configuration will be coupled to the nuances of a particular site design and the best result is achieved when the website code is designed to leverage the capabilities of the cache proxy in front of it. It’s a symbiotic relationship.
To design and build a website that delivers a fast experience to a single user and maintains that same experience for many concurrent users, your CDN or Varnish Cache deployment cannot be limited to your Production environment. We would also suggest that this outcome is not the sole responsibility of your infrastructure/operations team.
You need to bring the same caching rules that are used in Production back into all of your pre-Production environments and keep those cache configurations in sync with Production. Only then will your development team have visibility of the cache configurations and the run-time results of the cache. This will enable the website to be built with an understanding of what will be cached and why something fails to cache, and will help with achieving optimal cache durations.
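One lightweight way to notice when environments have drifted apart is to compare their cache rules automatically. The Python sketch below assumes the rules can be loaded as plain dictionaries; the rule names, the settings, and the `config_drift` helper are hypothetical and do not reflect any particular product's configuration format.

```python
def config_drift(production, other):
    """Return the names of rules whose settings differ between two environments.

    A rule that exists in only one environment counts as drift.
    """
    names = set(production) | set(other)
    return sorted(n for n in names if production.get(n) != other.get(n))


# Hypothetical rule sets for illustration only.
production_rules = {
    "static_assets": {"ttl": 86400},
    "search_results": {"ttl": 60},
}
development_rules = {
    "static_assets": {"ttl": 86400},
}

print(config_drift(production_rules, development_rules))
```

Here the search results rule exists only in Production, which is exactly the kind of gap that hides caching bugs until release. A check like this could run as part of a build, so that a difference is flagged as soon as it appears.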
Technology is Only Half the Challenge
As I well know from my time as a consultant, adopting this approach is as much a people problem as it is a technical challenge, especially considering the traditional barriers and conflicts between Dev and Ops. With section.io, the technical aspect of achieving this has been done for you.
Every section.io application is provisioned with a Production and a Development environment. The configuration of section.io is designed to use Git workflows to keep the environments in sync and promote changes. The section.cli tool enables you to launch, in your development environment, the same Varnish Cache proxy that is used in Production. The section.io Aperture portal gives users access to metrics, logs, and other diagnostics for each environment.
Whatever you’re using as your preferred caching solution for your website, remember to build with the cache in mind. Then verify how the cache affects your site and do this EARLY and OFTEN!