Improving Real-Time Monitoring Tools for CDN Network Operations Centers
In the past year, Lumen has invested across its CDN portfolio to bring both expanded capacity and new capabilities. In addition to diversifying our offerings (mesh delivery and edge compute to name a few), we have also placed a strong emphasis on improving the customer experience.
This strategic undertaking started with an examination of our internal processes.
Lumen CDN network operations engineers tend to the health of the network through a number of manual and programmatic tools that help identify the sources of anomalies and troubleshoot issues. With so many inputs, however, finding the proverbial needle in the haystack can prove labor-intensive.
To enhance monitoring and enable faster response times, we are currently overhauling our network analysis tools. One example is our Incident Response Tool (IRT). Built on common open-source components such as Elasticsearch, Logstash and Kibana, IRT ingests log streams from the CDN edge and lets us aggregate and visualize them in a meaningful way.
This offers us request-level visibility, or as I like to call it, “an atomic view of the edge.”
Updated every few seconds, IRT also facilitates sorting and clustering large amounts of network data with filters by country, customer, POP/machines, ASN, etc. Armed with a view that goes from global to granular, network operations engineers can more effectively pinpoint, isolate and drill down on issues to eradicate performance outliers.
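To make the filtering and clustering concrete, here is a minimal sketch of the kind of Elasticsearch search body that could produce a breakdown like this. It is illustrative only and not IRT's actual query: the field names (`asn`, `country`, `customer`, `latency_ms`), the five-minute window and the helper `build_breakdown_query` are all assumptions.

```python
from typing import Any


def build_breakdown_query(
    group_by: str, filters: dict[str, str], window: str = "now-5m"
) -> dict[str, Any]:
    """Build a search body: request count and p95 latency per bucket."""
    # Each filter becomes an exact-match clause (hypothetical field names).
    clauses: list[dict[str, Any]] = [
        {"term": {field: value}} for field, value in sorted(filters.items())
    ]
    # Restrict the view to recent log lines only.
    clauses.append({"range": {"@timestamp": {"gte": window}}})
    return {
        "size": 0,  # aggregations only, no individual hits
        "query": {"bool": {"filter": clauses}},
        "aggs": {
            "breakdown": {
                # One bucket per distinct value; doc_count is the request count.
                "terms": {"field": group_by, "size": 20},
                "aggs": {
                    "latency_p95": {
                        "percentiles": {"field": "latency_ms", "percents": [95]}
                    }
                },
            }
        },
    }


body = build_breakdown_query("asn", {"country": "FR", "customer": "example-customer"})
```

Changing `group_by` from `"asn"` to a POP or machine field would give the same view at a different level of granularity, which is the global-to-granular movement described above.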
For example, IRT can show which POP is serving traffic, down to the very machine that may be malfunctioning and need maintenance. Or, using ISP metrics, we may see that a certain ASN is experiencing abnormally high latency. In this case, we can narrow our investigation down to the IP prefixes in question, potentially leveraging our strategic relationships with the provider to work through the anomaly together.
[Figure: request count by ASN breakdown]
[Figure: performance by POP breakdown]
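The ASN-to-prefix drill-down described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how IRT is implemented: the record fields, the "twice the overall median" threshold and the helper `slow_groups` are hypothetical, and the ASNs and prefixes are taken from ranges reserved for documentation.

```python
from collections import defaultdict
from statistics import median


def slow_groups(records, key, baseline_ms, factor=2.0):
    """Return {group: median latency} for groups slower than factor x baseline."""
    by_group = defaultdict(list)
    for record in records:
        by_group[record[key]].append(record["latency_ms"])
    medians = {group: median(values) for group, values in by_group.items()}
    return {g: m for g, m in medians.items() if m > factor * baseline_ms}


# Hypothetical request-level records, one per log line.
records = [
    {"asn": 64500, "prefix": "192.0.2.0/24", "latency_ms": 40},
    {"asn": 64500, "prefix": "192.0.2.0/24", "latency_ms": 45},
    {"asn": 64501, "prefix": "198.51.100.0/24", "latency_ms": 50},
    {"asn": 64501, "prefix": "198.51.100.0/24", "latency_ms": 42},
    {"asn": 64502, "prefix": "203.0.113.0/25", "latency_ms": 400},
    {"asn": 64502, "prefix": "203.0.113.128/25", "latency_ms": 48},
    {"asn": 64502, "prefix": "203.0.113.0/25", "latency_ms": 420},
]

# Step 1: find ASNs that are slow relative to the overall median.
baseline = median(r["latency_ms"] for r in records)
slow_asns = slow_groups(records, "asn", baseline)

# Step 2: within those ASNs only, find the prefixes responsible.
suspect = [r for r in records if r["asn"] in slow_asns]
slow_prefixes = slow_groups(suspect, "prefix", baseline)
```

With this sample data, only AS64502 is flagged, and within it only the prefix 203.0.113.0/25, which is the set of prefixes one would then take to the provider.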
IRT is particularly useful for monitoring high-stakes live events where every millisecond matters. It can also be handy for troubleshooting an already-identified issue, as it can help better orient one-off performance investigations. Another application is pre-production configuration tuning, where we work hand-in-hand with CDN customers to understand how a specific configuration behaves on different networks or ISPs before going into production.
All in all, this and other tools under development help Lumen, and our CDN customers, get to that 0.01% of video session anomalies faster. By automating a portion of the investigation process, we can spend less time combing through data and more time implementing solutions.
While deployment has been limited up to this point, we look forward to rolling out IRT on a larger scale in the coming months. Stay tuned for further updates on our CDN monitoring tools and operations processes. For more information on how you can leverage IRT, drop us a line.
Explore Lumen’s CDN portfolio.
This content is provided for informational purposes only and may require additional research and substantiation by the end user. In addition, the information is provided “as is” without any warranty or condition of any kind, either express or implied. Use of this information is at the end user’s own risk. Lumen does not warrant that the information will meet the end user’s requirements or that the implementation or usage of this information will result in the desired outcome of the end user. This document represents Lumen’s products and offerings as of the date of issue. Services not available everywhere. Business customers only. Lumen may change or cancel products and services or substitute similar products and services at its sole discretion without notice. ©2020 Lumen. All Rights Reserved.