When building a cloud application, one of the first decisions to make is where to deploy it. In fact, Google’s helpful document on Best Practices for Compute Engine Regions Selection states “When to choose your compute engine regions… Early in the architecture phase of an app, decide how many and which Compute Engine regions to use.” Considerations for selecting the regions, according to Google, include latency, pricing, machine type availability, resource quotas and more.
It goes without saying that this is not just relevant for new apps; existing apps that need to scale with a growing user base should take these factors into consideration, particularly if the user base is distributed or even global.
Google specifically notes latency is a “key consideration for region selection” as high latency can lead to an inferior experience. The document walks through the impact on latency of various cloud deployment patterns – including single region deployment, distributed frontend in multiple regions and backend in a single region, distributed frontend and backend in multiple regions, and multiple parallel apps – and discusses various strategies and best practices that can be used to mitigate latency issues, including premium tier networking, cloud load balancing and cloud CDN, local caching and app client optimization.
To be frank, this all has us scratching our collective heads. To begin with, the document is 19 pages long. That’s 19 pages on how to choose where to deploy your app. Moreover, we find it a bit contradictory.
“When selecting Compute Engine regions, latency is one of the biggest factors to consider. Evaluate and measure latency requirements to deliver a quality user experience, and repeat the process if the geographic distribution of your user base changes.”
“Even if your app serves a global user base, in many cases, a single region is still the best choice. The lower latency benefits might not outweigh the added complexity of multi-region deployment.”
In other words, latency is critically important, so important that you should repeat this process as the geographic makeup of your user base evolves, but not so important that it’s worth the added complexity of a multi-region deployment.
Actually, in one sense this is a point we agree on wholeheartedly; we’ve written extensively on the barrier that complexity can create when scaling your application. However, we don’t think the right answer is to limit your application to a single region. The right answer is to get rid of the complexity.
We’ve written an entire paper on why organizations are modernizing applications with distributed, multi-cluster Kubernetes deployments. We think it’s critically important, and we discuss the benefits at length, including availability and resiliency, eliminating lock-in, improving performance and latency, increasing scalability, lowering cost, enhancing workload compliance and isolation, and more.
And yes, trying to self-manage a multi-region cloud deployment can be complex. That’s why we don’t make customers do it. At Section, we like to think of this as the de-risk deployment strategy. Or, more succinctly, the “don’t make me pick a region” approach to cluster deployment.
We don’t mean to pick on Google; they are a valued part of our Composable Edge Cloud. And in truth, this is a conundrum for any public cloud platform – deploying to more than one region is hard, yet staying with a single region is a de facto admission that a portion of your user base will have a sub-par experience.
This is the best-practice methodology for region selection, according to Google’s document:
Now that you have considered latency requirements, potential deployment models, and the geographic distribution of your user base, you understand how these factors affect latency to certain regions. It is time to decide which regions to launch your app in.
Although there isn't a right way to weigh the different factors, the following step-by-step methodology might help you decide:
- See if there are non-latency related factors that block you from deploying in specific regions, such as price or colocation. Remove them from your list of regions.
- Choose a deployment model based on the latency requirements and the general architecture of the app. For most mobile and other non-latency critical apps, a single region deployment with Cloud CDN delivery of cacheable content and SSL termination at the edge might be the optimal choice.
- Based on your deployment model, choose regions based on the geographic distribution of your user base and your latency measurements:
  - If user latency is similar in multiple regions, pricing might be the deciding factor.
  - For a single region deployment:
    - If you need low-latency access to your corporate premises, deploy in the region closest to this location.
    - If your users are primarily from one country or region, deploy in a region closest to your representative users.
    - For a global user base, deploy in a region in the US.
  - For a multi-region deployment:
    - Choose regions close to your users based on their geographic distribution and the app's latency requirement. Depending on your app, optimize for a specific median latency or make sure that 95-99% of users are served with a specific target latency. Users in certain geographical locations often have a higher tolerance for latency because of their infrastructure limitations.
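To give a sense of what following these steps by hand involves, here is a rough Python sketch of the single-region path: drop blocked regions, compare median latency, and let price break near-ties. It is our own illustration, not anything from Google's document. The `Region` class, the `price_index` field, the 10 ms tie threshold, and every latency and price figure are hypothetical placeholders, and real measurements would come from your own users.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Region:
    name: str
    price_index: float         # hypothetical relative price; lower is cheaper
    latencies_ms: list[float]  # hypothetical per-user latency samples
    blocked: bool = False      # ruled out by a non-latency factor (step 1)


def share_within(latencies_ms, target_ms):
    """Fraction of users whose measured latency is at or below the target."""
    return sum(1 for ms in latencies_ms if ms <= target_ms) / len(latencies_ms)


def pick_single_region(regions, tie_ms=10.0):
    """Pick one region: lowest median latency, with price breaking near-ties."""
    candidates = [r for r in regions if not r.blocked]
    if not candidates:
        raise ValueError("every region is blocked")
    best = min(median(r.latencies_ms) for r in candidates)
    # Regions within tie_ms of the best median count as "similar" latency.
    near_best = [r for r in candidates if median(r.latencies_ms) - best <= tie_ms]
    return min(near_best, key=lambda r: r.price_index)


# Made-up sample data, for illustration only.
regions = [
    Region("us-central1", 1.00, [40, 60, 120, 180]),
    Region("europe-west1", 1.10, [35, 70, 110, 190]),
    Region("asia-east1", 1.20, [150, 160, 200, 30], blocked=True),
]
choice = pick_single_region(regions)
print(choice.name, share_within(choice.latencies_ms, 150))
```

With this sample data the two unblocked regions have the same median latency, so price decides. Even this toy version leaves out quotas, machine type availability, and the need to rerun the whole exercise whenever your user base shifts.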
We prefer a simpler alternative. Just deploy to Section, and we’ll take care of the rest.