Power of HTTP Logs at the edge
February 25, 2018
When creating a website, you have to set up a lot of moving pieces to make it function: an HTTP server running software like Nginx or Apache, the application logic sitting behind that web server (such as a Node app or a Magento installation), the user-facing front end defined and styled with HTML and CSS, and, if you have application state, a database. Once your application is ready to use, you’ll have to start worrying about performance, security, and scalability, using software like Varnish Cache, a Web Application Firewall, and/or a Content Delivery Network.
At the end of the day, you have a lot of moving pieces within your stack that you need to monitor over time to make your application more efficient and performant. You also have to be able to debug the setup in case of an outage or incident. Unfortunately, these complex stacks are bound to run into issues over time, and you need tools that narrow down the root cause quickly so you can spend your engineering time fixing the problem, not trying to figure out what the problem is.
When dealing with logs, there are several distinct sources of data you need to account for.
First, there are the metrics from the actual server your stack is running on, which tell you whether it has enough resources to compute responses to the client quickly and efficiently. Second, there are the error logs produced by your application in the event of a malfunction, such as your Nginx server crashing. Luckily, Application Performance Monitoring (APM) tools such as New Relic capture and organize these metrics and error logs for you.
Third, you’ll need to worry about all the access logs generated by your software, which can be very useful for narrowing down the root cause of issues and analyzing traffic patterns over time. These can be difficult to digest and manage due to their sheer volume. There are third-party tools that digest these logs for you, but they can get quite expensive even at the entry-level pricing tiers.
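To make the idea of digesting access logs concrete, here is a minimal Python sketch that counts responses by status code. It assumes your server writes the common "combined" log format; the regular expression, the function names, and the sample lines are illustrative assumptions, so adjust them to match your own log format.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption; change it
# if your server uses a custom log format).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for one access log line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Some servers log "-" when no response body was sent.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record


def status_counts(lines):
    """Count how many requests ended with each HTTP status code."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["status"]] += 1
    return counts


# Made-up sample lines, using documentation-reserved IP addresses.
SAMPLE = [
    '203.0.113.7 - - [25/Feb/2018:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/7.58.0"',
    '203.0.113.7 - - [25/Feb/2018:10:00:01 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.58.0"',
    '198.51.100.4 - - [25/Feb/2018:10:00:02 +0000] "POST /cart HTTP/1.1" 502 0 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for status, count in sorted(status_counts(SAMPLE).items()):
        print(status, count)
```

A script like this is fine for a quick look at one file, but it does not scale to the volume a busy site produces, which is where a dedicated log pipeline comes in.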
The ELK stack can be very useful here, as it provides a log ingestion tool in Logstash, a highly efficient document store in Elasticsearch, and a user interface for exploring those documents in Kibana.
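As a rough illustration of what "document" means in this pipeline, the Python sketch below turns the fields of one parsed access log entry into a JSON document of the kind a log shipper might store in Elasticsearch. The field names other than @timestamp (the conventional Logstash timestamp field), the to_document helper, and the sample record are all assumptions made for the example, not a fixed schema.

```python
import json
from datetime import datetime


def to_document(record):
    """Shape one parsed access log entry as a JSON-ready dict."""
    # Access logs usually write timestamps like "25/Feb/2018:10:00:02 +0000";
    # convert that to ISO 8601 so it can be indexed as a date.
    timestamp = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        "@timestamp": timestamp.isoformat(),
        "client_ip": record["ip"],
        "request": {"method": record["method"], "path": record["path"]},
        "status": record["status"],
        "bytes": record["bytes"],
    }


# Hypothetical parsed entry, for illustration only.
record = {
    "ip": "198.51.100.4",
    "time": "25/Feb/2018:10:00:02 +0000",
    "method": "POST",
    "path": "/cart",
    "status": 502,
    "bytes": 0,
}

doc = to_document(record)
line = json.dumps(doc, sort_keys=True)

if __name__ == "__main__":
    print(line)
```

Once every request is stored as a structured document like this, questions such as "which paths returned 502 in the last hour" become simple queries in Kibana rather than ad hoc text searches.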
You can run your own ELK stack instance on a cloud provider like DigitalOcean or AWS, or use a hosted offering like Amazon Elasticsearch Service to get up and running with the ELK stack. These options are significantly cheaper than some of the third-party log digestion tools and also give you complete control over how your application stores its logs.
The Section platform provides a hosted ELK stack instance included in the pricing structure for each customer. This ELK stack does not digest your origin logs but comes pre-configured to digest all your HTTP traffic flowing through the Section infrastructure. We log every request hitting our servers, enriched with additional fields like the geolocation of the connecting user. We also capture all the logs from your proxy stack, for a better understanding of what each proxy is doing for you, as well as the responses being sent from your origin server. These logs can be very helpful in understanding how your content is being delivered and can even assist in debugging origin issues at a particular point in time.