Applying closed-loop automation in broadband networks

Sept. 21, 2021
Dynamic network optimization can deal with the presence of heavy users on a passive optical network (PON) and restore the other users’ capability to use peak capacity, without disturbing the download performance of the heavy users.

With a continual increase in service level expectations, network automation is a vital step in the digital transformation of an operator’s business. Multi-generational, multi-technology networks have led to complex operational processes with increased data sources and volumes, heightening the need for comprehensive data management and automation strategies to reduce human error and boost productivity.

Software-defined networking (SDN) helps to create more self-aware, self-governing networks that can apply artificial intelligence and machine learning (AI/ML) to automate operations, improve service assurance, and provide detailed analysis for anomaly detection, action recommendation, and capacity planning.

In this article, we’ll look at how automation improves network performance through smart service level agreement (SLA) management. We will show how dynamic network optimization can deal with the presence of heavy users on a passive optical network (PON) and restore the other users’ capability to use peak capacity, without disturbing the download performance of the heavy users. That may sound contradictory, but it is possible.

Software-defined access networks

Central to software-defined access network (SDAN) architectures is the SDN controller, which offers automation that can be adapted to a wide range of needs, services, and processes. Analytical engines can use SDAN’s open interfaces to retrieve real-time and historical data from central data lakes, while the closed-loop automation (CLA) framework enables zero-touch provisioning, network health analysis, and automated troubleshooting.

The open APIs enable operators to build their own automation apps on top of the SDN controller. As SDN controllers enable dynamic determination of threshold crossing alerts and KPI-based monitoring, the analytic engine can, in turn, trigger CLA routines for corrective actions, thus automating the loop (Figure 1).
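As a minimal sketch of how a dynamically determined threshold crossing alert might trigger a corrective routine, the example below derives the threshold from a rolling window of recent KPI samples and calls a callback when a new sample exceeds it. The class name `ThresholdMonitor`, the window size, the `k` multiplier, and the callback signature are all assumptions made for illustration; they do not describe any particular SDN controller's API.

```python
from collections import deque
from statistics import mean, pstdev


class ThresholdMonitor:
    """Raise an alert when a KPI sample exceeds mean + k * stddev of recent samples."""

    def __init__(self, window=12, k=3.0, on_alert=None):
        self.samples = deque(maxlen=window)  # rolling history of KPI samples
        self.k = k
        # Hypothetical hook standing in for a CLA corrective routine.
        self.on_alert = on_alert or (lambda value, threshold: None)

    def observe(self, value: float) -> bool:
        """Record one sample; return True if it crossed the dynamic threshold."""
        alert = False
        if len(self.samples) == self.samples.maxlen:  # wait for a full window
            threshold = mean(self.samples) + self.k * pstdev(self.samples)
            if value > threshold:
                alert = True
                self.on_alert(value, threshold)
        self.samples.append(value)
        return alert


# Usage: PON utilization hovers around 50%, then jumps.
monitor = ThresholdMonitor(on_alert=lambda v, t: print(f"alert: {v:.2f} > {t:.2f}"))
for utilization in [0.50, 0.52, 0.48] * 4:
    monitor.observe(utilization)
monitor.observe(0.95)  # crosses the threshold and invokes the callback
```

Deriving the threshold from recent history rather than fixing it in advance is one simple way to make alerts adapt to each PON's normal load.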

The challenge of PON bandwidth management

According to data published by Openreach, internet traffic in the UK more than doubled from 2019 to 2020 and some studies predict that it will exceed 1 TB/subscriber/month by 2025 (Figure 2).

The challenge for communication service providers (CSPs) in such a scenario is that network capacity and the corresponding infrastructure investments are planned over many years, while demand is booming and, as we’ve just seen with COVID-enforced lockdowns, can change almost overnight.

The notion of SLAs is also evolving, as consumers are now as dependent on their broadband connection as businesses are. “Best effort” services are becoming less acceptable to customers and regulators alike. As a result, it is getting increasingly difficult to dimension the network (determine the maximum number of subscribers per PON) and to set network configuration parameters such as the maximum rate per subscriber (the advertised peak rate).

Overbooking allows CSPs to advertise more throughput by relying on the fact that not all the subscribers will request the maximum throughput at the same time. However, to ensure acceptable service levels, overbooking has to be carefully managed.

The sharing of a PON among multiple users is typically handled by traffic schedulers and shapers. In this sharing scheme and without service differentiation, every active user will receive an equal share of the total bandwidth. The free bandwidth is automatically redistributed among active users. In a worst-case scenario when all subscribers are active, the minimum bandwidth that each user will receive is equal to the total bandwidth divided by the number of subscribers. In the best case, if there were only one active user, this user would get the full bandwidth.
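The sharing behaviour described above can be approximated with a max-min fair allocation, a standard textbook model in which unused bandwidth is redistributed among the users that still want more. The sketch below is only an illustration of that model: the function name, the user names, and the 1,000 Mb/s capacity are assumptions, and real PON schedulers and shapers are considerably more involved.

```python
def max_min_fair(capacity: float, demands: dict[str, float]) -> dict[str, float]:
    """Split `capacity` among users so that nobody gets more than they ask for
    and leftover bandwidth is shared equally among the remaining users."""
    allocation: dict[str, float] = {}
    remaining = capacity
    # Serve the smallest demands first; whatever they leave unused
    # raises the equal share available to everyone still pending.
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        equal_share = remaining / len(pending)
        user, demand = pending.pop(0)
        granted = min(demand, equal_share)
        allocation[user] = granted
        remaining -= granted
    return allocation


# Assumed example: a 1,000 Mb/s pool shared by three active users.
print(max_min_fair(1000.0, {"a": 100.0, "b": 600.0, "c": 600.0}))
# Worst case: every user wants everything, so each gets capacity / N.
print(max_min_fair(1000.0, {"a": 1000.0, "b": 1000.0, "c": 1000.0, "d": 1000.0}))
# Best case: a single active user gets the full bandwidth.
print(max_min_fair(1000.0, {"a": 1000.0}))
```

Note that the allocation depends only on who is active right now; past consumption plays no role.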

CSPs make sure that all subscribers can access services, like video streaming, all at the same time, and that significant extra bandwidth is available on top to address best effort services like web surfing. As user activities, at least for the best-effort applications, are totally uncorrelated, it is very unlikely that all of them request a high throughput at the same time, allowing significant overbooking with limited impact on the service level. 
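To make the overbooking argument concrete, the following sketch computes an overbooking factor and, under a simple binomial model, the probability that more users are active than the PON can serve at the full peak rate. Everything numeric here is an assumption chosen for illustration (32 subscribers, 1 Gb/s advertised peak rate, 10 Gb/s capacity, each user independently active 5% of the time), as are the function names; real traffic is burstier than this model suggests.

```python
from math import comb


def overbooking_factor(subscribers: int, peak_mbps: float, capacity_mbps: float) -> float:
    """Sum of advertised peak rates divided by the shared capacity."""
    return subscribers * peak_mbps / capacity_mbps


def prob_peak_unavailable(subscribers: int, p_active: float,
                          peak_mbps: float, capacity_mbps: float) -> float:
    """Probability that more users are active than can be served at peak rate.

    Assumes each user is independently active with probability `p_active`
    (a binomial model) and that an active user requests the full peak rate.
    """
    full_rate_slots = int(capacity_mbps // peak_mbps)
    return sum(
        comb(subscribers, k) * p_active**k * (1 - p_active) ** (subscribers - k)
        for k in range(full_rate_slots + 1, subscribers + 1)
    )


factor = overbooking_factor(32, 1000.0, 10_000.0)
risk = prob_peak_unavailable(32, 0.05, 1000.0, 10_000.0)
print(f"overbooking factor: {factor:.1f}, P(peak unavailable): {risk:.2e}")
```

The independence assumption is what keeps the risk small; a single permanently active user breaks it.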

One of the limitations of the PON, especially when overbooking is introduced, is its vulnerability to heavy users. Long-term subscriber fairness is hard to obtain because, at any given time, the bandwidth is shared fairly among the active users, irrespective of historical bandwidth utilization. This means PONs are vulnerable to people using their subscription to the max: the Netflix 8K streamer, the dubious downloader, the home-based 3D animator, or the subscriber running commercial services on a residential subscription. If a single user requests the maximum bandwidth permanently, it can prevent other subscribers from getting high bandwidth even for a short period of time.

Historically, CSPs have used rules of thumb for traffic management. However, traditional ways of managing heavy users (throttling, data caps, or premium pricing for higher SLAs) are not dynamic: once applied, they stay in force until removed, even when extra bandwidth is available or no other subscribers are affected, for example at night or during low-traffic periods.

CSPs are caught between two options: raising advertised peak rates while increasing overbooking levels, or keeping both at a conservative level to guarantee SLAs and limit the impact of heavy users.

Applying online and offline network optimization

Using CLA, we can implement dynamic bandwidth management that preserves peak rate availability even in the presence of heavy users. It ensures all subscribers within a PON (or a slice of a PON) have access to a “fair” portion of available bandwidth.

SDN takes advantage of telemetry (network data streamed continuously rather than requested periodically) to understand traffic patterns in real time and compare them with historical bandwidth consumption measured at different levels in the network (subscriber, PON, operator, etc.). This ability enables bottlenecks to be identified and autonomously remediated in near-real time, which is impossible in traditional systems. Achieving this goal requires a modern and efficient approach to network data capture and telemetry streaming, using technologies such as IPFIX and Kafka.

If insufficient capacity is detected, CLA reduces the bandwidth for heavy users based on their historical consumption by dynamically adapting the traffic schedulers and shapers. When more capacity becomes available, CLA restores the initial configuration. This practice guarantees long-term fairness for subscribers while maximizing bandwidth utilization. To illustrate the performance in a realistic scenario: one heavy user can prevent all other users from completing a successful speed test. With CLA activated, the probability of a successful speed test is restored to 80% for all users, while the heavy user’s download times increase by less than 1%, which is hardly a performance degradation.
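One iteration of such a loop could look like the sketch below. It is not the actual CLA algorithm, which the article does not specify: the `Subscriber` fields, the 90% congestion threshold, the rule that a heavy user is anyone responsible for at least half of historical consumption, and the 50% rate reduction are all assumptions picked to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    name: str
    peak_rate_mbps: float       # initially configured peak rate
    shaper_rate_mbps: float     # rate currently enforced by the shaper
    historical_usage_gb: float  # consumption over some look-back window


def cla_step(subscribers, pon_utilization, *, congestion_threshold=0.9,
             heavy_share=0.5, reduced_fraction=0.5):
    """Throttle heavy users while the PON is congested, otherwise restore them.

    All thresholds are illustrative assumptions, not values from the article.
    """
    total_usage = sum(s.historical_usage_gb for s in subscribers)
    congested = pon_utilization >= congestion_threshold
    for s in subscribers:
        is_heavy = total_usage > 0 and s.historical_usage_gb / total_usage >= heavy_share
        if congested and is_heavy:
            # Insufficient capacity: shape the heavy user below the peak rate.
            s.shaper_rate_mbps = s.peak_rate_mbps * reduced_fraction
        else:
            # Capacity available (or not a heavy user): initial configuration.
            s.shaper_rate_mbps = s.peak_rate_mbps


subs = [
    Subscriber("heavy", 1000.0, 1000.0, 900.0),
    Subscriber("light-1", 1000.0, 1000.0, 50.0),
    Subscriber("light-2", 1000.0, 1000.0, 50.0),
]
cla_step(subs, pon_utilization=0.95)  # congested: only "heavy" is shaped
print([(s.name, s.shaper_rate_mbps) for s in subs])
cla_step(subs, pon_utilization=0.40)  # capacity is back: everyone restored
print([(s.name, s.shaper_rate_mbps) for s in subs])
```

Because the restore branch runs on every iteration, a throttled user returns to the initial rate as soon as congestion clears, unlike a static cap.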

While the bandwidth closed-loop optimization operates online, telemetry can also be leveraged in offline AI/ML tools that are able to learn the traffic patterns from the network and recommend optimal advertised peak rates and the number of subscribers per PON for a desired SLA (Figure 3).

For example, an ML model can be trained to determine the number of subscribers per PON that corresponds to a minimum speed test success probability and a given advertised peak rate. Alternatively, an ML model can also be trained to determine the peak rate that corresponds to a certain speed test success probability and a given number of subscribers per PON.
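The first of these two uses can be sketched by treating the trained model as a black box that predicts the speed test success probability for a given subscriber count and peak rate, then searching for the largest count that still meets the target. The search function is an assumption about how such a model might be queried, and `toy_success_model` is a made-up placeholder formula, not a trained model or a real traffic model.

```python
from typing import Callable


def max_subscribers(success_prob: Callable[[int, float], float],
                    peak_rate_mbps: float, target: float,
                    upper_bound: int = 128) -> int:
    """Largest subscriber count whose predicted success probability meets `target`.

    Assumes the prediction does not increase as subscribers are added,
    so the search can stop at the first count that misses the target.
    """
    best = 0
    for n in range(1, upper_bound + 1):
        if success_prob(n, peak_rate_mbps) >= target:
            best = n
        else:
            break
    return best


def toy_success_model(subscribers: int, peak_rate_mbps: float) -> float:
    """Placeholder standing in for a trained model (purely hypothetical)."""
    capacity_mbps = 10_000.0  # assumed PON capacity
    activity = 0.25           # assumed average activity factor
    return min(1.0, capacity_mbps / (subscribers * peak_rate_mbps * activity))


# Dimension a PON for an assumed 1 Gb/s peak rate and an 80% success target.
print(max_subscribers(toy_success_model, peak_rate_mbps=1000.0, target=0.8))
```

The second use, finding the peak rate for a fixed subscriber count, would follow the same pattern with the search running over candidate rates instead.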

The same overbooking/heavy user problem exists when a single network is shared by different virtual network operators. In the same way, virtual operators or infrastructure providers can use CLA and offline recommendation tools to dynamically dimension both the physical PON and the virtual slices according to real-time usage and historical trends.

Conclusion

We expect SDN automation to make important decisions for broadband operators at an accelerating rate: unlocking new service capabilities, bringing more agility to their operations, enabling smarter decisions, and improving network performance.

Filip De Greve is product marketing director, Fixed Networks, at Nokia.

