Here are my top five CDN myths (for today) and a quick discussion of why they are not exactly true.
Myth 1: CDN Means Static Object Caching
When we discuss CDNs and their function, many engineers consider caching and serving static objects (images, CSS and JS files) to be the only function of a CDN. This is an important function of a CDN. Static objects are the easiest to cache on a CDN, and caching them certainly helps to ease bandwidth constraints and computational load at your origin servers. Serving static objects from a location nearer to the user (in terms of network latency) also helps to increase page speed.
However, CDNs can (and should) be so much more than just a static object serving network. CDNs are distributed PoPs containing reverse proxy servers. Those PoPs sit behind some DNS smarts which choose which PoP to send traffic to. The type of reverse proxy server contained in the PoP will determine what you can do with the CDN.
Some CDNs are only good for static object caching. However, when you run additional reverse proxy servers in the PoP, you can start performing some awesome functions at the CDN, including:
- HTML Caching – Massively improving page performance and computational offload from the origin servers
- Blocking Spurious Requests and DDoS attacks – Using a Web Application Firewall to screen and block HTTP requests where appropriate
- Blocking Bots and preventing Scraping – Using JS detection proxies to prevent bots scraping your site or taking it offline
- Modifying images as they pass through to reduce weight or resize
- Streaming large files like video
- And many, many more website augmentation options to improve performance, security and scalability.
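To make the reverse proxy idea a little more concrete, here is a minimal sketch in Python of two of the functions listed above chained together: screening a request, then serving HTML from cache or falling back to the origin. Everything in it is hypothetical and for illustration only (the `EdgeProxy` class, the `BLOCKED_FRAGMENTS` rule and the in-memory cache); real reverse proxies are far more sophisticated than this.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical screening rule: reject any path containing these fragments.
BLOCKED_FRAGMENTS = ("../", "<script")


class EdgeProxy:
    """Toy reverse proxy: screen the request, then serve from cache or origin."""

    def __init__(
        self,
        fetch_origin: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fetch_origin = fetch_origin
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Maps path -> (expiry time, cached body).
        self.cache: Dict[str, Tuple[float, str]] = {}

    def handle(self, path: str) -> Tuple[int, str]:
        # Step 1: WAF-style screening, before any cache or origin work is done.
        if any(fragment in path.lower() for fragment in BLOCKED_FRAGMENTS):
            return 403, "blocked"

        # Step 2: serve from cache while the entry is still fresh.
        now = self.clock()
        entry = self.cache.get(path)
        if entry is not None and entry[0] > now:
            return 200, entry[1]

        # Step 3: cache miss, so go back to the origin and remember the answer.
        body = self.fetch_origin(path)
        self.cache[path] = (now + self.ttl_seconds, body)
        return 200, body
```

Calling `handle("/index.html")` twice within the TTL only reaches the origin once, and a blocked request never reaches the origin at all. That is the offload and protection described above, in miniature.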
CDN means so much more than static object caching on a network of distributed servers. This is most easily appreciated when we accept that CDNs are built from reverse proxy software and that there are a good number of awesome reverse proxy servers which can be run in a CDN PoP. Aside from section.io, CDNs don’t generally disclose what reverse proxies they are running, so you won’t really see the “magic” in the black box. But rest assured there are reverse proxies in there driving the website augmentation. The trick is to choose the right CDN, one which gives you access to the reverse proxy servers you need.
Myth 2: More PoPs is Better
Back in the old days when we were using dial-up modems and three-and-a-half-inch floppy disks, more PoPs was better. In those days CDNs were the right way to navigate the problems of low bandwidth into and out of hosting centres and between network hops. These days, it makes sense to run PoPs as close as possible to the Internet backbone to cut down the time spent making multiple hops between network devices. Fewer PoPs also means fewer caches to fill with your content. Therefore, you will experience a higher cache hit rate for your content and ultimately more offload and faster retrieval of the content, which is what you were hoping for in the first place.
There is of course a balance to be struck between the number of PoPs and the global spread. Given the average number of requests to build a webpage these days is over 100, it makes sense to reduce the round-trip time by serving content from as close as possible to the users. Note that closeness in this case should be measured in terms of network latency, not proximity (although proximity is a factor contributing to latency).
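The cache hit rate argument can be sketched with some rough arithmetic. The Python below is a deliberately crude model, not a measurement: it assumes traffic is spread evenly across PoPs, that every PoP has to miss once per object to fill its own cache, and that nothing expires or is evicted. The function name and the example numbers are made up for illustration.

```python
def estimated_hit_rate(total_requests: int, unique_objects: int, pop_count: int) -> float:
    """Rough cache hit rate when each PoP must be filled separately.

    Assumes even traffic spread, one cold miss per object per PoP,
    and no expiry or eviction.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    cold_misses = min(total_requests, unique_objects * pop_count)
    return 1.0 - cold_misses / total_requests


# Same traffic (100,000 requests for 200 objects), different PoP counts.
for pops in (5, 50, 200):
    rate = estimated_hit_rate(total_requests=100_000, unique_objects=200, pop_count=pops)
    print(f"{pops} PoPs: {rate:.0%}")
```

Under these assumptions the same traffic goes from roughly a 99% hit rate on 5 PoPs to roughly 60% on 200 PoPs, simply because there are more caches to warm.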
More modern CDNs provide fewer PoPs closer to the Internet backbone. The next generation of CDN provides you the opportunity to select the number (and location) of PoPs which is right for your application and audience.
Myth 3: You Should use a CDN Domain
Previously, in the days of HTTP/1, using multiple domains to serve static content was seen as a means of delivering a page into a user's browser more quickly. “Sharding” content away from the main www.website.com domain onto, say, cdn1.website.com or static.website.com allowed a user's browser to bring the content in more efficiently by opening up multiple TCP connections and parallelising the content download across those connections.
This is not the case any more. HTTP/2 means content can be downloaded in a parallelised fashion over a single TCP connection. So, opening up those additional TCP connections forces a cost which is not necessary. The use of multiple domains, or sharding, is now a performance anti-pattern.
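To put a rough number on that cost, the sketch below counts the handshake round trips a browser has to pay just to set up its connections. It is a back-of-envelope model under stated assumptions: one round trip for the TCP handshake, one for a TLS 1.3 handshake (pass `tls_round_trips=2` for TLS 1.2), and six connections per hostname, which is a typical browser limit under HTTP/1.1. DNS lookups are ignored, and because connections open in parallel this measures total overhead rather than added wall-clock time. The function name is hypothetical.

```python
def handshake_round_trips(
    hostnames: int,
    connections_per_host: int,
    tls_round_trips: int = 1,
) -> int:
    """Total round trips spent on TCP and TLS handshakes across all connections."""
    tcp_round_trips = 1  # SYN / SYN-ACK before any data can flow
    return hostnames * connections_per_host * (tcp_round_trips + tls_round_trips)


# HTTP/1.1 with content sharded across three hostnames, six connections each.
sharded = handshake_round_trips(hostnames=3, connections_per_host=6)

# HTTP/2 multiplexing everything over one connection to one hostname.
single = handshake_round_trips(hostnames=1, connections_per_host=1)

print(f"sharded HTTP/1.1: {sharded} handshake round trips")
print(f"single HTTP/2 connection: {single} handshake round trips")
```

With these assumptions the sharded setup spends 36 round trips on handshakes against 2 for the single HTTP/2 connection.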
Some CDNs still force their customers to use static object domains because it is easy for the CDN to deploy in this fashion. This is robbing the website of the opportunity to make use of our modern protocol – HTTP/2.
Myth 4: You do not need to use SSL to Origin
When you set up your website and install an SSL certificate without a CDN in play, the traffic for your website travels all the way to your origin infrastructure in an encrypted fashion. You may terminate at the load balancer and decrypt there or perhaps directly at the web server.
However, when you bring a CDN into play, the SSL connection will be terminated and decrypted at the CDN edge so that the CDN can serve the traffic per the CDN configuration and then, where required, relay requests back to the origin servers for the origin servers to complete. The process of terminating and decrypting at the CDN is why you need to install or provision an SSL certificate into a CDN if you want the CDN to serve SSL traffic for you.
But not every CDN will re-encrypt the data on the way back to your origin. This means traffic could be intercepted on the way back to the origin from the CDN.
You should have a certificate installed at your origin and your CDN should re-encrypt the requests before sending to your origin servers so that your CDN can maintain a secure conversation for your customers all the way from their browser back to the origin.
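As a rough illustration of what "re-encrypting to origin" means from the proxy's side, here is a minimal Python sketch using only the standard library. It is not how any particular CDN implements it; the function names are hypothetical. The idea is simply that the edge refuses to talk to the origin over plain HTTP and verifies the origin's certificate when it connects.

```python
import ssl
import urllib.request


def build_origin_context() -> ssl.SSLContext:
    """TLS settings for edge-to-origin requests."""
    # create_default_context() verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def fetch_from_origin(url: str, timeout: float = 5.0) -> bytes:
    """Relay a request to the origin, but only over an encrypted connection."""
    if not url.lower().startswith("https://"):
        raise ValueError("origin requests must use HTTPS")
    with urllib.request.urlopen(
        url, timeout=timeout, context=build_origin_context()
    ) as response:
        return response.read()
```

If the origin has no valid certificate installed, the call fails instead of quietly falling back to an unencrypted connection, which is the behaviour you want from the CDN-to-origin leg.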
Myth 5: CDNs can/should only be a “production” network
All legacy CDNs have been built from a networking perspective. CDN companies have dropped physical infrastructure in data centres and dropped a “black box” of reverse proxy software onto that hardware. Unfortunately for these legacy CDNs, they are now hamstrung by the network structure of their CDNs as the software is difficult to update and certainly hard to share.
CDNs (well actually the reverse proxy servers which run on them) are a part of a website delivery stack in the same way as a web server or a load balancer.
Modern website engineers working in an agile fashion want to have web servers, load balancers and databases running in their local development environment. They should also have the reverse proxy servers running in their development environment.
A software driven approach to the management of reverse proxy servers means that developers can now have (if they use section.io) the CDN running in their local development environment (and test and staging or wherever they want).