Home | Blog | Drive Your Network Forward with KPI-Based Anomaly Detection

Drive Your Network Forward with KPI-Based Anomaly Detection

For many years network management and optimization have followed a traditionally trusted path, entrusted by dedicated engineers who monitored its day-to-day performance based on main KPIs such as availability, quality, and retention. As new iterations of the network were introduced, moving from 2G to 3G to 4G, etc., the services offered to end-users have expanded significantly. First came voice and SMS services, then data, which was followed by video streaming. Each iteration has also considerably increased speeds, with 5G expected to be up to 100 times faster than 4G.

As networks evolved, they became more complex and started generating more significant amounts of data, making it impossible for engineers to track manually. With a large number of metrics that can be measured, operators have accumulated a broad dataset to explore in order to optimize the performance of their network.

To meet the demands of handling such large volumes of data while delivering a quality customer experience with so many metrics measured, operators should adopt an automated AI/Machine Learning (ML) anomaly detection process that can provide accurate, real-time KPI-insights. With anomaly detection, operators can assure their network’s high-quality experience, discovering critical data deviations before they negatively impact their subscribers.

What exactly is anomaly detection, and why do operators need it to optimize their networks’ performance?

Anomaly Detection: Predict Your Network Behavior

Anomaly detection is a method of searching for data that does not match an expected behavior or a pattern in a given dataset. In other words, an anomaly is a deviation from business as usual. With anomaly detection, one can process data faster and more efficiently, detecting abnormal events, changes, or shifts in existing datasets.

As an operator, you usually track many metrics such as channel utilization, TCP retransmission, Round Trip Time (RTT), jitter, latency, packet loss, availability, and others, which are then recorded and analyzed. Each of these metrics is measured based on a specific time series with a normal baseline of behavior. If a particular data point deviates outside the usual range expected for a particular metric, then that data point is considered an anomaly. This process enables you to identify and even predict network behavioral patterns, providing useful insights into your data.

Anomalies could cause system-wide degradations in the network, impacting the end-users’ quality of experience and possibly affecting your revenue stream. On the other hand, a data anomaly could be a sign of an opportunity to learn and predict the network’s needs, auto-optimize its performance, leading to improved user satisfaction.

Furthermore, successful anomaly detection hinges on an ability to accurately analyze datasets in real-time and enhance root-cause analysis. It enables you to track and uncover exceptions in critical KPIs to proactively address network performance in real-time.

Using Machine Learning to Automate Anomaly Detection

Machine Learning (ML) is a specific subset of AI that focuses on providing systems with the ability to learn and improve from experience without being explicitly programmed. It focuses on finding patterns and features in vast amounts of data and using it to learn and draw conclusions for itself.

ML has a vital role to play in helping you to monitor these KPIs proactively and rapidly troubleshoot issues. Anomaly detection relies on ML algorithms that seamlessly correlate data with relevant KPI metrics to provide a complete story for predicting the network’s behavior. These algorithms identify data patterns over time and across datasets and learn from them. They can also make predictions based on that data to forecast future issues, achieve a high degree of network automation, and mimic the human-decision making process by auto-responding to network events.

Using ML, KPI-based anomaly detection allows an operator to receive a nearly real-time alert from sudden network degradation counters. When anomalies are detected, either network engineers can be alerted, or automated network actions can be triggered immediately. This will free engineers from monitoring these critical KPIs manually, enabling them to focus on other high-priority issues. Different departments within the operators’ organization, such as the NOC, listen and receive these critical alerts and proactively correct network degradations before the subscribers even become aware of them.

The most efficient way to implement anomaly detection is by using an automated assurance solution that applies the AI/ML to data, as it is collected for assurance purposes.

This next-generation assurance solution should be based on machine-learning-friendly modular architectures, with advanced ML models seamlessly integrated at its core, such as Prophet for forecasting time series data (built and open-sourced by Facebook). The ability to choose from various algorithms and ML models so that the right ones can be used for different use cases is also essential. Having such an assurance solution with built-in ML anomaly detection will enable you to:

Analyze the immense amounts of data generated by the network, which no human can do.
Rapidly identify performance issues across all services.
Receive alerts if a specific KPI breach occurs.
Identify patterns over time and group various anomalies to perform automated root-cause analysis in case an issue arises.
Optimize the network in the most efficient way without the need for manual adaptation, accumulating experience over time.

Summary

In conclusion, to ensure optimized network performance, operators will need to use an AI/ML KPI-based anomaly detection process, correlating all datasets and auto-detecting deviations in the network behavior. Such an anomaly detection process is part of our advanced assurance solution, RADCOM ACE, which will allow operators to gain near real-time notifications of KPI-based anomalies, automatically analyzing data over an extended period, to assure an improved network performance for their subscribers.

While operators have been laser-focused on optimizing data collection, now it’s time to learn how to wisely use this data to gain in-depth insights that will drive your network forward. RADCOM ACE provides operators with an Automated, Containerized, and End-to-End network assurance solution. Its built-in, ML-driven anomaly detection process allows teams to control the amount of data delivered to them, so they don’t become overwhelmed and focus on top priority tasks and truly customer-affecting issues. Consequently, our solution assures a superior customer experience and operational excellence, turning problems into opportunities, far outweighing the past networks.

To learn more about RADCOM ACE for performing KPI-based anomaly detection, download our white paper about AI Driven-Assurance for 5G.

This article is subject to Radcom’sdisclaimers regarding Forward-looking statements and general information under the links below:

Radcom’s Forward-looking statements disclaimer

Radcom’s General information disclaimer

Yariv Waits

Product Lead Data Analytics