How AIOps and ITPA Can Help Telecom Companies Manage the Challenges of 5G

Contributed by Amer Alsharkawi, Regional Sales Director META & APAC regions, Resolve Systems

How Telcos Can Rein in 5G Challenges with AIOps and IT Process Automation

 

As 5G technology advances in power and sophistication, telcos are encountering a new set of challenges. 5G network complexity, coupled with a spiraling volume of data traffic, raises barriers for telcos seeking to smoothly manage their networks and sustain consistent quality of service (QoS). But a solution is now within reach. Artificial intelligence for IT operations (AIOps) technology, combined with IT process automation (ITPA), can streamline telco network operations, minimize downtime, and boost overall network efficiency.

 

Solving the Impact of 5G Network Management   

 

The rapid, even explosive, progress of 5G brings several management challenges. Network complexity tops the list. That’s because along with 5G’s advantages and enhanced value come a denser infrastructure and many additional components, far more than in previous generations. The consequences of this additional baggage? A dramatic spike in alerts, upsurges in data, and frustrating complexity in incident management.

 

Additionally, as 5G data traffic takes off, speeds rise and latency falls. This makes for an exciting, positive environment for users, who can now consume and generate data at unprecedented rates. For telcos, however, the increased traffic means a constant scramble to keep pace and avoid congestion and network downtime, particularly during peak usage.

   

The User Challenge: Maintaining QoS at the Pace of 5G  

Meet Resolve Systems on March 20th in Dubai at booth #4 at FutureNet MENA.

 

Finally, telco companies must face the most unrelenting demand of all: quality of service. QoS influences business outcomes and 5G ROI. In a complex, dynamic environment, users and applications exert pressure on the network as never before, and QoS issues can damage the bottom line. Maintaining QoS in a 5G world is critical for supporting customer satisfaction and preventing churn. That consistency and quality demand proactive monitoring and fast incident response. Enter AIOps and ITPA.

 

A Real-World Telco Company Confronts Intense 5G Challenges   

 

To illustrate the beneficial impact of AIOps and ITPA in 5G telco, let’s explore a real-world scenario. Imagine a telco has recently launched its 5G network in a major metropolitan area and is now confronting the expected challenges: mounting network complexity, soaring data traffic, threatening congestion, and consumer expectations for seamless QoS. 

 

Challenge 1: Beating Back Mounting Complexity 

 

To overcome complexity, this telco wisely chose to implement AIOps, putting the power of artificial intelligence into identifying patterns and anomalies in its network alerts. With its speed and intelligence, AIOps can instantly analyze data from multiple sources, including network logs, performance metrics, and user experience data. This critical information enables the IT team to detect issues proactively—before they take their toll on the network. AIOps also prioritizes incidents based on their severity, helping the team gain valuable time and visibility.   
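As a rough illustration of the two ideas in this paragraph, the sketch below flags unusual samples in a performance metric and orders alerts by severity. It is a minimal example under stated assumptions, not how any particular AIOps product works: the `Alert` fields, the 1-to-5 severity scale, and the simple z-score rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Alert:
    source: str    # hypothetical element name, e.g. "cell-017"
    severity: int  # assumed scale: 1 = critical ... 5 = informational
    message: str


def find_anomalies(samples, threshold=2.0):
    """Return indexes of samples more than `threshold` standard
    deviations away from the mean (a basic z-score rule)."""
    if len(samples) < 2:
        return []
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        return []  # all samples identical, nothing stands out
    return [i for i, x in enumerate(samples) if abs(x - mu) / sigma > threshold]


def prioritize(alerts):
    """Order alerts so the most severe (lowest number) come first."""
    return sorted(alerts, key=lambda a: a.severity)


# Illustrative latency samples in milliseconds; the last one is a spike.
latency_ms = [10, 11, 9, 10, 12, 10, 11, 95]
spikes = find_anomalies(latency_ms)

queue = prioritize([
    Alert("cell-017", 4, "minor packet loss"),
    Alert("core-upf-2", 1, "user plane unreachable"),
    Alert("cell-101", 3, "high latency"),
])
```

A production system would use far richer models and correlate many data sources, but the shape is the same: detect deviations first, then rank what needs attention.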

 

Challenge 2: Untangling Data Traffic Congestion 

 

To address the threat of data traffic congestion, the telco now calls upon ITPA to automate routine network optimization tasks, saving time and streamlining management of the mounting data traffic. ITPA identifies patterns in data traffic and adjusts network resources accordingly. With resources deployed optimally, the network performs at its peak, reducing the risk of congestion and downtime.
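To make "adjusts network resources accordingly" concrete, here is a minimal rule-based sketch. The function name, the notion of "capacity units", and the 80% and 30% thresholds are assumptions chosen for illustration; a real ITPA workflow would call the operator's own orchestration tooling instead of returning a number.

```python
def plan_capacity(utilization, current_units,
                  scale_up_at=0.8, scale_down_at=0.3, min_units=1):
    """Return how many capacity units to run, given utilization
    expressed as a fraction between 0.0 and 1.0."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if utilization >= scale_up_at:
        return current_units + 1  # approaching congestion: add capacity
    if utilization <= scale_down_at and current_units > min_units:
        return current_units - 1  # quiet period: release capacity
    return current_units          # within the normal band: no change


# Hypothetical peak-hour and off-peak readings.
peak = plan_capacity(0.92, current_units=2)
off_peak = plan_capacity(0.15, current_units=2)
```

Keeping a gap between the two thresholds avoids flapping, where the automation repeatedly adds and removes capacity around a single cut-off.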

 

Challenge 3: Assuring Network QoS 

 

Putting AIOps and ITPA to work together on the common goal of maintaining network QoS, the telco can now proactively and transparently solve potential issues before the user is even aware of a problem. AIOps and ITPA avert QoS issues by monitoring network performance and applying AI to safeguard QoS preemptively. AIOps analyzes user experience data and identifies potential anomalies in network performance, enabling the IT team to resolve issues before they impact customers.

 

Teamwork Merges the Talents of AIOps and ITPA to Optimize IT Ops 

 

By combining the unique capabilities of AIOps and ITPA, telco companies improve their network operations, tackling the challenges of 5G head-on. AIOps enables the telco to identify and prioritize incidents, proactively monitor network performance, and analyze data to identify abnormal patterns and deviations. This accelerates incident resolution and improves network efficiency drastically. 

 

ITPA, for its part, automates routine network optimization tasks that are necessary but burdensome, tasks that tie down IT staff and affect not only efficiency but also the employee experience. With ITPA, the IT team can bring its talents to bear on more complex, higher-level issues with potentially greater consequences. This intelligent use of both human and technology resources minimizes the risk of errors and maintains consistency in a complex environment by handling incident response before an incident affects the environment and, therefore, customer satisfaction, employee loyalty, and the bottom line.

 

Applying AIOps and ITPA to Their Full Potential

 

To fully leverage and combine the unique capabilities of AIOps and ITPA, telco companies should implement these best practices: 

 

Deploy a unified solution: An integrated AIOps and ITPA solution offers the value, accuracy, and convenience needed to manage network alerts and incidents. Deploying AIOps and ITPA together reduces the need for multiple tools and enforces standardization and interoperability across the incident response process.

 

Identify key use cases: AIOps and ITPA have proven to be ideal for specific use cases, and it’s smart to identify those and be ready to implement them. These include proactive network monitoring, incident response automation, and network optimization. By identifying these key use cases, a telco can prioritize and direct the proper human and technology resources to the most critical issues, applying AIOps and ITPA where they can deliver the highest benefits.   

 

Focus on data quality: AIOps and ITPA can only contribute accuracy and effectiveness when they can access optimized data to analyze. To ensure the best outcomes, telco companies should prioritize delivering quality data by means of data cleaning, data normalization, and data enrichment. 
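The three data-quality steps named above can be sketched as a small pipeline. This is an illustrative example only: the record fields (`device`, `latency`, `unit`), the inventory lookup, and the site name are hypothetical, and real telemetry would need many more rules.

```python
def clean(records):
    """Data cleaning: drop records missing required fields and exact duplicates."""
    seen = set()
    kept = []
    for r in records:
        if not r.get("device") or r.get("latency") is None:
            continue
        key = (r["device"], r["latency"], r.get("unit"))
        if key in seen:
            continue
        seen.add(key)
        kept.append(r)
    return kept


def normalize(record):
    """Data normalization: latency in milliseconds, device names trimmed and lowercased."""
    factor = {"ms": 1.0, "s": 1000.0}[record.get("unit", "ms")]
    return {
        "device": record["device"].strip().lower(),
        "latency_ms": record["latency"] * factor,
    }


def enrich(record, inventory):
    """Data enrichment: attach the site name from a (hypothetical) inventory lookup."""
    return {**record, "site": inventory.get(record["device"], "unknown")}


raw = [
    {"device": "GNB-1 ", "latency": 2, "unit": "s"},
    {"device": "GNB-1 ", "latency": 2, "unit": "s"},  # duplicate
    {"device": "", "latency": 5},                      # missing device
    {"device": "gnb-2", "latency": 12, "unit": "ms"},
]
inventory = {"gnb-1": "dubai-central"}

prepared = [enrich(normalize(r), inventory) for r in clean(raw)]
```

Running the steps in this order means the analytics layer only ever sees records with consistent units, consistent names, and known context.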

 

Embrace change management: Implementing AIOps and ITPA successfully in a 5G environment requires an insightful shift in mindset and processes. Decision-makers should examine and prioritize change management to ensure that personnel and teams are trained and knowledgeable on the new tools and processes demanded by 5G transition, as well as fully invested in the effort.   

 

Ethically audit and validate algorithms: AIOps algorithms are notoriously vulnerable to absorbing bias reflected in the data used to train them. To ensure accuracy and ethical use of AI, and to avoid potentially harmful influences or actions, telco leadership should examine, audit, and validate AIOps algorithms to the highest standards on an ongoing basis. 

 

Close the Loop with AIOps and ITPA

 

AIOps and ITPA are essential resources for managing the numerous and detailed challenges of 5G network management. Telco companies should rely on AIOps to proactively monitor network performance, prioritize incidents, and analyze data. Concurrently, they should turn to ITPA to automate routine network optimization tasks. By combining the respective capabilities of these two key technologies in the 5G landscape, telco companies can ensure that their network operations are seamlessly efficient, and that they benefit individual users and the organization. By following best practices, telcos can fully leverage the capabilities of AIOps and ITPA to ensure optimal management of 5G challenges, providing consistent QoS and a high-quality customer experience. 

 

To learn more about how our Silver Sponsor, Resolve Systems, solves network complexity for telco companies, request a demo or visit us on March 20th in Dubai at booth #4 at FutureNet MENA.

The Future of Network Service Assurance: Real-time Visibility, AI, and Cloud-native Architecture

Contributed by Matthew Twomey, Head of Product Marketing & Marketing, Anritsu Service Assurance

The rapid evolution of network technologies and the increasing complexity of services have transformed the landscape of network service assurance. In the era of 5G and beyond, operators are faced with escalating demand for new business-focused applications, improved service quality, and real-time issue resolution. The future of network service assurance lies in real-time visibility of subscriber/device issues, the application of AI, and adapting to cloud-native architecture amidst changing 3GPP standards. This article will discuss these crucial aspects and differentiate between service assurance from a fulfilment perspective and service assurance from the network.

 

Real-time Visibility of Subscriber Issues vs. Real-time Visibility of Network Functions

Matthew Twomey is speaking at FutureNet MENA, 20 March, Dubai

Traditionally, network service assurance focused on monitoring network functions and infrastructure to ensure seamless connectivity. However, as networks become more complex and user expectations rise, the focus has to shift toward real-time visibility of subscriber issues. This entails understanding the customer experience, detecting and resolving service issues in real-time, and proactively addressing potential problems before they affect the end user. This shift in focus is essential for maintaining customer satisfaction, reducing time spent troubleshooting, and ultimately, ensuring business success.

 

The Application of AI in Real-time

Artificial Intelligence (AI) is poised to play a pivotal role in the future of network service assurance. With the increasing complexity of networks, manual monitoring and intervention have become less efficient and less effective. Assurance-AI can analyse vast amounts of data generated by the network infrastructure and user devices to identify patterns, predict potential issues, and prescribe actions to optimize network performance. This real-time AI-driven analysis, looking at subscriber-level detail, allows service providers to detect and resolve issues faster, improve network efficiency, and enhance the overall customer experience.

 

Fulfilment Perspective vs. Network Perspective

Service assurance can be viewed from the fulfilment perspective and the network perspective. From a fulfilment perspective, service assurance focuses on the end-to-end process of delivering services to customers, including order management, provisioning, and activation. This viewpoint emphasizes the seamless delivery and activation of services to meet customer expectations. In this realm, real-time AI, though just as important, faces fewer visibility challenges.

On the other hand, service assurance from a network perspective needs to acquire its data and is concerned with monitoring and maintaining the health and performance of the network infrastructure. This includes identifying and resolving network issues, ensuring availability, and optimizing network performance. Of course, the subscriber experience is the foundational part of this view.

Both perspectives are crucial for a comprehensive approach to service assurance. Understanding and addressing service assurance from these two angles ensures that service providers can effectively manage network resources, deliver high-quality services, and maintain customer satisfaction.

 

Challenges in Moving to a Cloud-native Architecture

The transition to cloud-native architecture is essential for operators to keep pace with the increasing demands of businesses and subscribers while supporting technologies like private networks. However, this move comes with its own set of challenges. One significant ongoing challenge will be the adoption of evolving 3GPP standards, which continue to redefine network specifications. These evolving standards introduce escalating complexity for operational teams, requiring them to adapt to new tools, protocols, and processes. This necessitates a continuous learning process, ensuring that teams have the skills and knowledge to manage and assure cloud-native networks effectively. This further cements the need to look at AI-assisted assurance systems today.

Another challenge for future networks is capturing the data needed to understand subscribers’ experiences, as operators always have. 5G SA and future versions have security built into the heart of the network, which makes data acquisition more difficult. If you don’t understand the subscribers’ experience, you cannot take a corrective action on the network, either manually or automatically, because you won’t know its impact on subscribers.

In conclusion, the future of network service assurance calls for a paradigm shift from mere real-time visibility of network functions to real-time visibility of subscriber issues. This approach will facilitate a more customer-centric model, enhancing user experience and driving automation. AI’s real-time application will play a crucial role in addressing the increasing complexity of network management and enabling proactive troubleshooting. As the industry transitions to a cloud-native architecture and adapts to the ever-evolving standards, operational teams must navigate the escalating complexity accompanying these changes with a shift to becoming an automation-driven operational team.

A competitive cost base is Orange’s end-game, not short-term cost reduction

Yves Bellego, Director Network Strategy at Orange

Yves Bellego of Orange talks to Contributing Editor Annie Turner about the group’s next big, experimental step towards the fully automated network.

Yves Bellego, Director Network Strategy, has worked for the Orange group for almost 30 years, in which time he’s seen a lot of change. Like life in general, things inside telcos don’t always turn out exactly the way we expect, and looking back to the beginnings of network automation with NFV, he reflects that too much emphasis was put on cost savings from the start.

All kinds of extravagant claims were made: for example, research carried out by ACG Research in 2015 reckoned mobile network operators could reduce capital expenditure of the mobile packet core by 68% and operating expenditure by 67% through the use of NFV over a five-year period.

Bellego says that cost savings should not have been and should not be a short-term driver. In his view, “There were two drivers [for NFV and telco cloud]. One was just the technological improvement, the evolution, of servers that make it technologically feasible to put functions on more harmonised servers. The second driver is flexibility, operational efficiency if you like, because those are the benefits we saw from very early deployments.”

Bellego also thinks the extent to which early virtualization efforts moved things on is underestimated. He says, “the scalability effect on capacity is so much easier within the telco cloud than it was with dedicated hardware servers.” He adds, “What is coming and what we expect are the capabilities that will be given by the availability of more data. This is in fact something that we are just starting to test.”

A blueprint for future networks

Orange Lannion Lab

Bellego is referring to the experimental 5G network that Orange announced at the end of June. The operator has built the network at its famous R&D centre in Lannion in Brittany, France.

The cloud-native, software-defined network, known as Pikeo, is to be developed as a blueprint for future networks, based on Open RAN and cloud architecture.

The motivation for building Pikeo is to gain a deep understanding of exactly how the end-to-end 5G Standalone cloud-native network – which Orange says is the first in Europe – will perform in the real world. The ultimate goal is to identify the best network architecture to deliver ubiquitous, flexible and automated connectivity that adjusts to any given service, to the particular user and the specific situation in an area of coverage.

Put another way, Orange is working to find out how AI-enabled, zero-touch networking functions, and to understand the full implications for a large, complex software-enabled, Kubernetes-based network.

Introducing services

The operator plans to deliver 5G services to Orange employees in Lannion starting this summer, then extend this to both employees and some Orange customers over the next two years, gradually increasing this to several hundred users. The operator plans to deploy the network in other locations in 2022 and start testing network applications running on network slices.

When Pikeo was announced, Bellego’s colleague, Michaël Trabbia, Chief Technology and Innovation Officer at Orange, said staff who now monitor networks and deal with alarms and trouble-tickets will become AI experts and contribute to improving the algorithms so that the network becomes increasingly proactive, instead of reactive.

As Bellego says, “5G Standalone is one of the bigger tracks on which we can benefit from the cloud and have the capability to set up containers or virtual machines to easily ramp up and down there. So that should help to set up and remove network functions pretty easily, but that is not the only benefit – having much more efficient optimization algorithms based on data coming from lots of different sources is another big trend that we see in the coming months.”

Bellego says, “So with functions being in the cloud, from the radio to the core, the interesting thing is when we put the AI algorithms that can work on data from the radio and from the core, and from the transmission and from the IT. This is where we expect to have improvements in the operational efficiency and in the end, on the performance of the network.”

What we have today

He explains, “Today we have C-SON – centralised self-organising – it’s a kind of automation on the network with a little bit of AI inside it, but the limitation of that is it’s mainly and almost only on the radio mobile access. It does not encompass mobile access and core, and what we are just starting to test is optimization algorithms that encompass all of the network, the radio, but also the backhaul, the IP, the core network, and we can include some customer data. Then we will have capabilities that we do not have today.”

“In terms of network performance…this is when we will get the full benefit of the cloud [and] we will have better cost structure compared to our competitors and compared to the past. From a customer perspective, what we have achieved on the B2B side a few years ago with software-defined networking – which was a great improvement, [through] automation – we expect to have similar benefits for our entire customer base.”

Regarding the telco cloud infrastructure, he continues, “what applies to the cloud is the same as what applies to all elements of the network. The same on the radio, we could also say we want to own radio with specifications that differ from the specifications of Telefonica or Deutsche Telekom or Vodafone, for example, but we realise that that does not work. We are in an industry where economies of scale are of great importance and we need to have some commonalities of specifications.”

Bellego concludes, “We do not want to have a network that is totally different from that of Telefonica, say. We are in the same race, but at the end, we want to differentiate on the way we use a deployed network, on the tools we put on it and the architecture we put in place. This is our way to compete.”

The Path to 5G SA Automated Assurance

Contributed by Matthew Twomey, Head of Product Marketing, Anritsu Service Assurance

5G SA is here. CSPs (communication service providers) around the world are planning, trialling, and deploying 5G SA Technology in their networks. The GSA (Global Mobile Suppliers Association) identified* 79 operators in 42 countries worldwide that are investing in public 5G SA networks (in the form of trials, planned or actual deployments). There are 12 CSPs who have launched 5G SA services to customers around the world. In addition, there are currently 311 organizations deploying 5G SA for private networks.

To ensure robust 5G SA services for subscribers and businesses, CSPs are investing large sums in their 5G networks, business and operational support systems. The key investments are in spectrum, cloud resources, and domain expertise. The potential for new business models in private networks, network slicing, AR/VR, fixed wireless access, cloud gaming, smart cities, and more means this is not business as usual. These new services will drive additional revenue streams and hence each of these services will need to be monitored and assured (from a performance & quality perspective) for speed, latency, and device density.

To back the investments, you must be able to manage and control your 5G SA network. The key is Visibility. Visibility is required into the different logical layers in 5G SA, the newly separated control and user plane, the high dynamicity of the cloud network topology and network functions, and the services available in a service mesh architecture. The ability to decrypt the 5G SA Core encrypted control plane is also required to enable visibility. With the increased levels of complexity and real-time changes in the network, Visibility in Real-Time is a MUST in the 5G SA service assurance fabric.

Given those 5G SA visibility challenges, a CSP ought to understand the characteristics required of a 5G SA service assurance solution. For example,

  • The service assurance solution must be cloud-native and support cloud-native infrastructure.
  • The solution needs visibility and understanding of the full stack, from the cloud infrastructure layer to the containers of the network functions and must include the interfaces between those network functions.
  •  It needs to be highly adaptive to process the frequent changes in the cloudified network.
  • An automation-first approach is essential with a real-time anomaly detection application driving closed-loop network corrective actions.
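The last characteristic, anomaly detection driving closed-loop corrective actions, can be pictured as a detect, act, and verify cycle. The sketch below is a simplified illustration under assumptions: `read_metric` and `remediate` are hypothetical callables standing in for real monitoring and orchestration interfaces, and a fixed threshold stands in for a real anomaly detector.

```python
def closed_loop(read_metric, remediate, threshold, max_attempts=3):
    """Read a metric; while it breaches the threshold, apply a corrective
    action and re-check. Stops after `max_attempts` so a failing fix
    escalates to people instead of looping forever."""
    attempts = 0
    value = read_metric()
    while value > threshold and attempts < max_attempts:
        remediate()
        attempts += 1
        value = read_metric()  # verify the effect of the action
    return {"healthy": value <= threshold, "attempts": attempts, "last_value": value}


# Simulated network state: each corrective action lowers latency by 20 ms.
state = {"latency_ms": 50}

def read_latency():
    return state["latency_ms"]

def add_capacity():
    state["latency_ms"] -= 20

result = closed_loop(read_latency, add_capacity, threshold=15)
```

The verify step after every action is the part that depends on visibility: without a trustworthy reading of the subscriber impact, the loop cannot tell whether its action helped.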

Network functions are in containers in 5G SA. They come from multiple software vendors and work according to interpretations of GSMA specifications, so the internal workings of each network function will differ. Edge cases will arise around functionality, capacity, and latency, and these will drive issues. Here are a few examples:

‒      When there is a service-affecting issue with a network function, will that network function prioritise addressing the issue, and will it report the issue?

‒      When the network function fails, to whom does it report, or to whom must it report?

There is an escalating chance of service-affecting issues and perhaps a cascade of failures between containers. The massive investment in cloudified networks compels CSPs to monitor and assure interfaces, infrastructure, services, and customers in the 5G SA service assurance fabric.

Domain expertise in service assurance matters, and it matters more in 5G SA, where the network is cloud-based, disaggregated, and dynamic, with escalating complexity, escalating data volumes, and internal east/west traffic. The Anritsu Service Assurance fabric captures data, in context, to create structures of intelligence aimed at providing value and use cases. Drawing on over 20 years of telecoms service assurance domain expertise, our goal is to use data in a smart way. With 5G SA, our assurance strategy is to leverage AI/ML-based Real-Time Anomaly Detection first and complement it with the details and insights that network and service operational teams typically need. Our solution portfolio of performance analytics, troubleshooting, and customer experience solutions works together seamlessly in 5G SA and with legacy technologies.

The Anritsu path to 5G SA service assurance has been years in the making, with strategic investment in virtualization, 5G domain capabilities, and a company-wide digital transformation. Based on the commercial contracts awarded, we have started deploying the Anritsu 5G SA service assurance solution for two world-renowned Tier-1 operators. This is an immense endorsement of our strategy, our investments, and the trust earned as a long-term partner for a cloudified future. Anritsu looks forward to working with its current customers, new customers, and partners to expand the Service Assurance possibilities of 5G SA, cloudification, and digital transformation.

Come talk to our experts on our LinkedIn Showcase page:

https://www.linkedin.com/showcase/anritsu-service-assurance

*GSA: 5G Stand Alone Global Market Status: Executive Summary June 2021

News in Brief: Hard to tell where telco ends and cloud begins

Contributing Editor Annie Turner rounds up the latest automation highlights.

Google announced in a blog that it has joined the O-RAN Alliance to help drive and accelerate the realization of O-RAN initiatives using its expertise. It outlined five areas in which it feels it can contribute, including AI for autonomous and self-healing networks. The blog said, “digital transformation will require architecting, designing, and deploying intelligence across a distributed cloud network that is fundamentally powered by AI and closed-loop automation. Our vision is to work with the O-RAN Alliance to enable cloud-native intelligent networks that are secure, self-driving, and self-healing – bringing Google’s wealth of software experience and global leadership in the areas of machine learning, massive data processing, and geospatial analytics.”

Google Cloud is to partner with Ericsson to jointly develop 5G and edge cloud solutions to help operators with their digital transformations and unlock new enterprise and consumer use cases. They are building on the experience gained in Italy with TIM at Ericsson’s Silicon Valley D-15 Labs innovation center, where solutions and technologies can be developed and tested on a live, multi-layer 5G platform.


At the same time, Italian operator TIM and its cloud division, Noovle, said they will launch the first 5G cloud network in Italy. The project will use TIM’s Telco Cloud infrastructure, Google Cloud’s solutions, and Ericsson’s 5G Core network and automation technologies.

According to TIM, the solution will enable, “Faster deployment of the 5G digital applications through the automation of industrial processes and the implementation of services in real-time, thanks to edge computing, based on specific requirements”.

IBM is expanding its portfolio of automation software for operators with offers designed to help them to stand up and manage 5G networks faster (in minutes, not days), on-premise or in the cloud. The IBM Cloud Pak for Network Automation provides automation tools for implementing 5G and edge services that manage multivendor, software-based network functions. The Pak includes analytics to help operators discover hidden patterns and trends in their 5G network data. The offer integrates with IBM Cloud Pak for Watson AIOps and IBM Edge Application Manager to allocate network bandwidth and resources dynamically when and where required.

IBM cemented a further deal with Verizon. The US operator chose IBM and its subsidiary Red Hat to provide a hybrid cloud platform for its 5G network. IBM and Verizon have a long history of collaboration. Steve Canepa, Global GM & Managing Director, IBM Communications, said, “I’m delighted to announce the next major step in our partnership. Verizon has chosen IBM and Red Hat to help build and deploy an open hybrid cloud platform with automated operations and service orchestration as the foundation of its 5G core.”

Meanwhile, Orange said it will launch “the first” experimental, fully end-to-end cloud-based 5G Standalone (SA) network at Lannion, in Brittany, France, next month. The operator added that this is the blueprint for future infrastructure.  Richard Webb, Director, Network Infrastructure at CCS Insight, commented, “This is a significant announcement by Orange on several fronts: firstly, the fully cloud-based aspect, not only indicating Orange’s roadmap towards fully automated, software and AI-driven networks, but also its commitment to Open RAN principles of vendor diversity – with some notable absences from the named technology partners.

“Secondly, this sort of project is vital not only in giving operators like Orange a clearer understanding of how Open RAN solutions interoperate but also how cloud-based, software-defined, intelligent networking transform the operator itself, in terms of its own operations. This is a 5G Standalone network, considered by many to be the ‘true version’ of 5G in which a richer services environment can be fully realised. Although this experimental network is small-scale to start with, it could provide proof points for enhanced user experiences.”

Start-up Augtera Networks has raised $13 million in a round led by Intel Capital, and is backed by executives from firms including Cisco, Juniper and Gainspeed. This brings its total funding to $18 million: other investors include Bain Capital Ventures, Dell Technologies Capital and Acrew Capital. Augtera uses AI technologies, including machine learning, to detect and fix networking troubles. It is already in use by Orange and Colt Technologies, and Dell and Netone are selling the solution. Augtera claims its Network AI can prevent 40% to 50% of network incidents, detect critical issues 90% faster and reduce resolution times by 50% to 60%.

Earlier in June, Nokia announced it would supply a 10Gbps infrastructure for DELTA Fiber in the Netherlands, where the deployment of FTTH is progressing rapidly. The contract covers both the network and customer premise equipment. The network can be upgraded to run 25Gbps when required. DELTA will manage the network using Nokia’s Altiplano Access Controller which enables network automation, faster innovation and simplified operations through software-defined access network (SDAN) solutions. The deployment will combine Nokia’s SDAN technology and developer ecosystem with Microsoft Azure’s cloud-based services.

Rakuten Mobile’s DNA – template, replicate, automate, accelerate

Rahul Atri, Managing Director, Rakuten Mobile Singapore, Head of Products & Engineering, Rakuten Communications Platform (RCP), has “worked for two or three greenfield companies” including Reliance Jio, which totally disrupted the Indian market, and various vendors.

Now at Rakuten Mobile, he works for the most disruptive greenfield of all, and some think the business model behind his particular area of responsibility, the RCP, is more revolutionary than the network build. Atri says Rakuten Mobile’s network is foundational to RCP’s success.

How do you go about building a communication infrastructure unlike any other in the world? Atri says that although Rakuten Mobile sees itself very much as an IT company rather than a telco, it faced considerable challenges. Hence, “when we began, we focused primarily on the basic components – people, process then technology. Automation for us has been a necessity and part of the culture from the start,” he says.

Most of his team comes from a software engineering background. Even so, “We invested a lot in people to build the entrepreneurship, mindset and DNA to think differently, not as typical telcos. Then we focused a lot on processes to define how we go about ‘solutioning’ anything, and how to make the process more digital. We didn’t start with operations on day one,” he explains.

Fundamental to automation

The team soon realised that automating everything demands precision: “For instance, if you want to auto-commission the RAN, you need to be very sure which server on which radio site, in what location and with what serial number is out there and that you want to push your configuration on,” Atri says.
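
The precision Atri describes can be sketched in a few lines of Python. This is an illustration only, assuming a hypothetical inventory record: the names `InventoryRecord` and `can_auto_commission` and all sample values are invented for the example and do not describe any Rakuten system. The idea is simply that configuration is pushed only when server, radio site, location and serial number are all known and the serial number matches what the field reported.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InventoryRecord:
    """Hypothetical inventory entry for one RAN server (illustrative fields only)."""
    server_id: Optional[str]
    radio_site: Optional[str]
    location: Optional[str]
    serial_number: Optional[str]


def can_auto_commission(record: InventoryRecord, reported_serial: str) -> bool:
    """Allow a configuration push only if every field is known and the serial matches."""
    fields = (record.server_id, record.radio_site, record.location, record.serial_number)
    if any(value is None or value == "" for value in fields):
        return False
    return record.serial_number == reported_serial


# Sample records: one complete, one with an unknown location.
complete = InventoryRecord("srv-01", "site-A", "location-1", "SN123")
incomplete = InventoryRecord("srv-02", "site-B", None, "SN456")
```

With these sample records, `can_auto_commission(complete, "SN123")` passes, while a mismatched serial number or the record with the missing location is refused.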

He claims this is in contrast to operators’ often piecemeal approach to automation: they deploy everything then scramble to figure out exactly which inventory is where and what configuration is needed.

In the interests of speed, Rakuten Mobile was designing, building and running the network at the same time. Atri explains, “We realised the whole idea of automation is not about integrating a couple of systems with APIs north and southbound, but to manage the lifecycle from scratch; to do the operational day to day.

“Automation was an absolute necessity because we had to launch this network and that’s how we were able to do that in one and a half years. We launched 200 to 300 sites live on air every day. We auto-configured 20 to 30 edge clouds every day, and then operations, especially in this COVID situation, became a little easier for us because we invested heavily in automation.”

He states, “For us there was no playbook, no cheat code and no tools to configure the cloud-native network, and that’s…why I mentioned people and processes before the technology.”

Atri says, “[The team] went into details of each and every call flow, each and every integration. There’s not a single member in my team who doesn’t understand that when an OSS system talks to a MANO system, and a MANO system talks to a cloud, what parameters are exchanged between them. So we went into that level of detail and that’s how we kind of templatized everything, and this is the platform that we are now taking into the global arena”.

Open to the world

Rakuten Mobile is making its architecture and best practices available to operators round the world through the RCP to help them replicate Rakuten Mobile’s model and success. Rakuten Mobile will not discuss customers and prospective customers at the moment, but the RCP allows customers to visit an online marketplace where they can purchase and deploy everything they want to run their private, cloud-native, virtualised 5G mobile network, regardless of where in the world they are based.

Atri elaborates: “After the infrastructure, you have the cloud and you can deploy Kubernetes or OpenStack, whichever version [of them] you want, and on top of that you have the applications and orchestration. You can design the application and manage the lifecycle.

“The BSS/OSS layer runs on top of that. OSS is more than the ordinary OSS: we offer 45 systems…You can [choose and] register your vendor digitally, register their material hardware/software services, then you can do procurement – you can raise RFPs and track them.

“You have warehouse services, inventory management, inventory across hardware and the physical-logical topology. Then you have the automation of configuration management, fault management, performance management and I could keep talking for another 45 minutes…”.

To get to this point with the RCP Atri says the team “burned through a lot of nights focusing on how to build templates and standard interfaces, so now when they meet potential operator customers the RCP knows what to ask. It doesn’t need to speak to their vendors and the usual things, it just asks them to fill in the template and we know how to integrate the application from there.”

Atri says, “With the success of Rakuten in Japan, people are getting more open, they talk to us a lot more…Coming to the business side, we really think RCP is the pivotal point where the ecosystem and the industry will change.

“I personally love our network in Japan because it’s our baby, we built it from scratch and whatever technology we are taking to the world has been tried and tested in Japan”.

Rakuten Mobile runs Open RAN and virtual RAN (vRAN) commercially to carry 4 million users’ traffic on its end-to-end cloud network. He states, “[We] are managing the first realistic CI/CD [continuous integration/continuous deployment] pipeline of automation where you have auto-rollbacks and upgrades when you’re talking about the [biggest] number of edge clouds for a telecom network anywhere in the world, where you’re talking about having shifted everything from bare metal to a cloud-native approach on VNFs.”
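
As a rough illustration of what an auto-rollback step in such a pipeline involves, the Python sketch below upgrades one site, runs a health check, and restores the previous version if the check fails. It is a simplified assumption about the general pattern, not Rakuten Mobile’s pipeline; the function names, site names, version strings and the fake health check are all hypothetical.

```python
from typing import Callable, Dict


def upgrade_with_rollback(
    versions: Dict[str, str],
    site: str,
    new_version: str,
    is_healthy: Callable[[str, str], bool],
) -> bool:
    """Upgrade one site; restore the previous version if the health check fails."""
    previous = versions[site]
    versions[site] = new_version
    if is_healthy(site, new_version):
        return True
    versions[site] = previous  # automatic rollback
    return False


# Hypothetical health check: pretend version "2.1" fails on edge-2 only.
def fake_health_check(site: str, version: str) -> bool:
    return not (site == "edge-2" and version == "2.1")


deployed = {"edge-1": "2.0", "edge-2": "2.0"}
results = {
    site: upgrade_with_rollback(deployed, site, "2.1", fake_health_check)
    for site in list(deployed)
}
```

After the run, the healthy site stays on the new version and the failing site is back on its previous one, which is the behaviour an auto-rollback is meant to guarantee.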

Embracing the edge

While many telcos are shying away from or taking cautious steps towards the edge, Atri says that from the earliest stages of network design Rakuten planned to deploy regional data centres with “media services, caches, storage, and all the end services users want. It was always about customer experience, right from the start. It was about video, super low latency, and really superior service: being more reliable – all that stuff.”

He thinks the challenge for operators is that they are used to having perhaps three data centres, not maintaining and deploying thousands of them. Atri explains, “For us it comes to replicating templates. We have different regional data centres [that] are all the same size, the same capacity and the same type of deployment.

“On the far edge, there are five or eight or 10 different types and sizes of edge data centres,” which are templated in terms of the number of racks, how many sites are terminating at the same edge, which version of cloud is involved, how many configurations, the cluster and port size, the number of IP addresses and so on.
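
The templating idea can be sketched in Python as follows. This is an assumption-laden illustration: the `EdgeTemplate` fields mirror some of the parameters listed above, but the class names, template names and numbers are hypothetical and are not taken from Rakuten Mobile.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class EdgeTemplate:
    """Hypothetical template for one type of edge data centre."""
    name: str
    racks: int
    max_sites: int       # radio sites terminating at this edge
    cloud_version: str
    ip_addresses: int


@dataclass(frozen=True)
class EdgeDeployment:
    site_id: str
    template: EdgeTemplate


SMALL = EdgeTemplate("small", racks=1, max_sites=50, cloud_version="v1", ip_addresses=64)
# A second type derived from the first, changing only what differs.
LARGE = replace(SMALL, name="large", racks=4, max_sites=200, ip_addresses=256)


def deploy(site_ids: List[str], template: EdgeTemplate) -> List[EdgeDeployment]:
    """Stamp out identical deployments from a single template."""
    return [EdgeDeployment(site_id, template) for site_id in site_ids]


edges = deploy(["edge-001", "edge-002", "edge-003"], SMALL)
```

Because every deployment points at the same immutable template, operating thousands of edge sites reduces to managing a handful of template definitions.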

Equipment is tracked from the warehouse to deployment where the field engineer photographs each piece of equipment’s QR code to get the serial number and sends it back to base to provide a digital record. The equipment is allocated an IP address, and “I’m ready to do my upgrades or deploy cloud or anything,” Atri says. “Now for us, rolling out the feature and services is much faster.”
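
A minimal Python sketch of that tracking flow, using only the standard library, might look like this. The `EquipmentRegistry` class, the serial numbers and the subnet (a range reserved for documentation) are assumptions made for illustration; the article does not describe how the real system allocates addresses.

```python
import ipaddress
from typing import Dict


class EquipmentRegistry:
    """Toy registry: maps scanned serial numbers to management IP addresses."""

    def __init__(self, subnet: str) -> None:
        # hosts() yields the usable addresses of the subnet in order.
        self._free = iter(ipaddress.ip_network(subnet).hosts())
        self.records: Dict[str, object] = {}

    def register(self, serial: str):
        """Record a scanned serial number and allocate the next free address."""
        if serial in self.records:
            return self.records[serial]  # re-scans keep the original address
        address = next(self._free)
        self.records[serial] = address
        return address


registry = EquipmentRegistry("192.0.2.0/29")  # illustrative subnet
first = registry.register("SN-0001")
second = registry.register("SN-0002")
again = registry.register("SN-0001")
```

Scanning the same serial number twice returns the address already on record, so the digital record stays consistent with what is physically installed.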

Do standards help or hinder?

Atri says in principle standards are good, but they can hinder innovation. This is particularly true when it comes to complex lifecycle management, like self-healing, because while in a data centre there is enough spare capacity to shift applications onto another server if there is a problem, it is not always possible at the edge, especially when the issue is power related or where it co-exists with other facilities.

Atri continues, “Also you have hardware and cloud to monitor, and virtual machines or containers, and vRAN. Self-healing is super easy where you only need to take care of one layer, but sometimes I have to correlate…up to the fourth layer if my KPIs aren’t right. These are practical examples and challenges, and I don’t see any practical standards that are mature enough.”
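
To make the multi-layer correlation problem concrete, here is a deliberately simple Python sketch. It assumes the four layers Atri names and one simplifying rule, that a fault in a lower layer explains alarms in the layers above it. Real correlation logic would be far richer; the function name and sample data are hypothetical.

```python
from typing import Dict, Optional

# Layers named in the text, ordered from the bottom of the stack upwards.
LAYERS = ["hardware", "cloud", "virtual machine or container", "vRAN"]


def probable_root_cause(health: Dict[str, bool]) -> Optional[str]:
    """Return the lowest unhealthy layer, or None if every layer is healthy.

    Assumption: a fault in a lower layer explains alarms in the layers above it.
    Layers missing from the input are treated as healthy.
    """
    for layer in LAYERS:
        if not health.get(layer, True):
            return layer
    return None


# A vRAN KPI alarm that traces back to a cloud-layer fault.
observed = {
    "hardware": True,
    "cloud": False,
    "virtual machine or container": False,
    "vRAN": False,
}
```

Here `probable_root_cause(observed)` points at the cloud layer, so healing would start there rather than at the vRAN layer where the KPI degradation was first noticed.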

“On a scale of one to 10, I’d love to be standards compliant on eight, but still have the bandwidth to innovate”.

Raising the bar

Rakuten Mobile can carry out certain auto-upgrades across its whole network in eight minutes. While devising the solution to achieve this, Atri says, “We thought, ‘This can be a product where you can schedule things, they can auto-upgrade…’ and that’s how we think in here about automation and use cases.

“There’s a long, long journey we think about…to let the network run on its own,” but, nothing daunted, he adds, “The bar keeps going up. People were so happy with auto-configuration, then they want self-healing, then AI in the network to tell them everything that is happening, so we are pushing the bar every time, but there is still a lot to do.”

He concludes, “It will be very interesting to see [the impact of] AI and machine learning as we are only two or three years old as a network and there is a lot of network-related data that we are working on for other use cases such as energy saving – we are looking at how to save 30 or 40% – and customer satisfaction and customer rating…they are very important – we are spending a lot of effort and building a lot of platforms there.”

Analytics-Driven Automation Is Critical for Mobile Network Operators to Master 5G Complexity at Scale

By Andrew Colby, Head of 5G Strategy and Product Management, Guavus. 

Operators that incorporate 3GPP-compliant data analytics into their networks from the outset can scale out and manage 5G deployments cost-effectively.

5G networks offer the promise of transforming wireless experiences with low-latency, gigabit data speeds delivered with ultra-reliability. However, these advances come at a price. Mobile network operators (MNOs) face unprecedented complexity as they begin to scale their 5G networks – complexity that can spiral beyond the abilities of humans using existing tools and only semi-automated workflows.

MNOs applying traditional processes to 5G network management will face challenges that place the economic returns of massive 5G investments at risk. However, those that integrate the Third-Generation Partnership Project’s (3GPP’s) Network Data Analytics Function (NWDAF) into their 5G Core can master the complexity of 5G by applying analytics-driven machine intelligence to network automation and service orchestration.

The 3GPP has defined a 5G Service-Based Architecture (SBA) that relies on network data analytics to continuously monitor network state across the 5G RAN/Core infrastructure, analyze this data in real time, and deliver statistical and predictive analytics outputs to Network Functions (NFs) and to Application Functions (AFs), which sit above the network layer. These functions consume the outputs to automate 5G network and service operations.

 

5G Services Go Cloud Native

Standalone 5G Core networks will operate on cloud-native platforms using the same technologies that powerhouses like Amazon, Microsoft, and Google use to deliver cloud-based services at massive scale. Cloud-native infrastructure is disaggregated, virtualized and software-defined, enabling MNOs to rapidly conceive, develop and deploy a new generation of 5G services by employing the same state-of-the-art DevOps methodologies that IT organizations have adopted to manage the operations of today’s digital service providers. Real-time data analytics is a critical component of the DevOps CI/CD (Continuous Integration/Continuous Deployment) lifecycle: it delivers constant feedback about operational state, which is used to drive (automatically or manually) any system changes or modifications needed to ensure that service operations meet SLAs.

 

4G/LTE Tools and Practices Not Well-Suited for 5G

5G RAN/Core infrastructure will support highly dense networks with thousands of small cells packed into relatively small geographic areas and massive scale IoT connectivity delivering 5G service to millions of smart devices. In addition to scale, 5G networks will be far more dynamic than 4G/LTE networks, and so it will be even more critical to continuously monitor the state of the network and the behavior of connected devices to track network performance, detect faults and take any actions required to ensure service continuity.

Existing OSS tools and operational practices were not conceived with these considerations in mind. While 4G/LTE operators have had some success employing analytics and machine learning in existing networks, these efforts have required a customized “bolt-on” approach, due to a lack of standards for data analytics and a system architecture that was optimized for network operations centers staffed by humans. With 5G, the 3GPP is ensuring that analytics is not an afterthought, and that analytics-driven automation is built into the system architecture.

 

Data Analytics Standards Facilitate 5G Multi-Vendor Interoperability

The 3GPP’s 5G data analytics standards are also critical to facilitate the integration of 5G RAN/Core system components from multiple vendors. For example, NWDAF specifies standard data types, formats, data collection APIs and standard data outputs and APIs for analytics processing. This is important because 5G cloudification and the scope of 5G services to be deployed will foster the growth of a 5G supplier ecosystem which will be far more diverse than the existing 4G/LTE ecosystem, which today is dominated by a relatively small number of suppliers.

 

NWDAF: 3GPP Standard for Network Data Analytics

NWDAF represents the mobile industry’s first attempt to standardize the function of analytics in the mobile core network. NWDAF incorporates standard interfaces for collecting different types of data from certain 5G Core NFs and applies the results of analytics processing to inform the operation of other NFs, applying machine intelligence to network automation and service orchestration.

A key problem that NWDAF addresses is data normalization across dissimilar interfaces and data formats in multi-vendor networks. This problem has historically made data collection, aggregation, integration, and analysis from different suppliers’ equipment difficult and time-consuming, resulting in a problematic ROI for many data analytics projects. That situation has now changed. A key part of solving the data normalization problem also involves specifying data semantics for the statistical and predictive analytics outputs that are delivered to NFs and AFs in the 5G Core.
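
The normalization problem is easy to see in a small Python sketch. The two vendor record layouts and the common schema below are invented purely for illustration; they are not the data types defined by the 3GPP. The point is only that, without a standard, every consumer has to carry per-vendor mapping code like this.

```python
from typing import Any, Dict

# Hypothetical raw records from two vendors reporting the same measurement.
VENDOR_A = {"nfId": "smf-1", "cpuLoadPct": 72}
VENDOR_B = {"function": "smf-2", "cpu_load": 0.5}  # load as a 0..1 fraction


def normalize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map a vendor-specific record onto one illustrative common schema."""
    if "nfId" in record:
        return {"nf_id": record["nfId"], "load_percent": float(record["cpuLoadPct"])}
    if "function" in record:
        return {"nf_id": record["function"], "load_percent": record["cpu_load"] * 100.0}
    raise ValueError("unknown vendor format")


normalized = [normalize(VENDOR_A), normalize(VENDOR_B)]
```

When sources emit a standard format in the first place, the per-vendor branches in `normalize` are no longer needed.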

Operationally, NWDAF runs as an NF in the 5G Core network, and once it is deployed, NWDAF registers its Analytics IDs with the Network Repository Function (NRF) – a centralized repository for all of the 5G NFs – and discovers all NFs it needs to communicate with. From there, OAM systems can gather operational intelligence quickly and easily from NWDAF. The data output by NWDAF is easily utilized by the consumer NFs in the 5G Core, or by 5G Management Data Analytics Functions (MDAFs) that can also integrate data from an array of other sources spanning the 5G Core, 5G RAN, external data networks, the underlying transport network, and the mobile edge network (see figure). All this takes place without the need for any data reformatting or normalization.
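
The register-then-discover pattern described above can be sketched with a toy in-memory repository in Python. This is not the NRF service API defined by the 3GPP: the class, method names and analytics ID strings are placeholders chosen for the example.

```python
from collections import defaultdict
from typing import Dict, List


class ToyRepository:
    """In-memory stand-in for a repository function (not the 3GPP NRF API)."""

    def __init__(self) -> None:
        self._providers: Dict[str, List[str]] = defaultdict(list)

    def register(self, nf_name: str, analytics_ids: List[str]) -> None:
        """Record which analytics IDs a network function can supply."""
        for analytics_id in analytics_ids:
            self._providers[analytics_id].append(nf_name)

    def discover(self, analytics_id: str) -> List[str]:
        """Return the functions registered for an analytics ID."""
        return list(self._providers.get(analytics_id, []))


repo = ToyRepository()
# The ID strings below are placeholders, not identifiers from the standard.
repo.register("nwdaf-1", ["slice-load", "nf-load"])
repo.register("nwdaf-2", ["nf-load"])
```

A consumer that needs NF load analytics would call `repo.discover("nf-load")` and receive both registered instances, while an ID nobody has registered returns an empty list.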

Today, most networks billed as “5G” comply with the 3GPP’s 5G New Radio (NR) standards for the RF portions of the network but with the radios connected via a 4G/LTE core. While this is an expedient way to quickly build out 5G NR sites, MNOs won’t realize the full business potential of 5G until they are operating in 5G Standalone (SA) mode, with the radios connected via a 5G Core. The 3GPP has designed the new, cloud-native 5G Core to support the network-wide data collection and analytics processing needed to derive the critical statistical and predictive insights that will power 5G network automation and service orchestration.

 

NWDAF Defines Standard Use Cases

In addition to standard data inputs, outputs and APIs, the 3GPP has also defined a set of standard NWDAF use cases, which specify the source NFs, the statistical or predictive analytics outputs, and the consuming NFs for each particular use case. The following are several examples. For a complete list of standard NWDAF use cases in 3GPP Release 16, see the sidebar, “3GPP-Defined NWDAF Use Cases.”

 

  • Network slicing. This is a capability in 5G networks that’s analogous to virtual LANs (VLANs) in IP networks, creating logical segmentation between end points or applications over a common physical network infrastructure. In a 5G environment with tens or hundreds of network slices, it will be challenging to determine which network slice can provide the best service to a given device. To address this, one use case defined for NWDAF is identifying and predicting the load per individual NF and for each network slice instance. The Network Slice Selection Function (NSSF) can use this information from the NWDAF to help it determine to which network slice a newly registered device (also known as “UE,” or “user equipment”) should be assigned.

 

  • Session load balancing. Similarly, an Access and Mobility Management Function (AMF), which manages subscribers’ access to the network, might request specific intelligence on the load level of several Session Management Function (SMF) instances in order to assign a UE to the SMF best able to serve it.

 

  • Policy decision making. A policy check allows the Policy Control Function (PCF) to apply appropriate policies to the device connection based on the current state of the network. The PCF can use information from the NWDAF for the observed service experience of a device to determine if the application SLA is being satisfied, and if not, what QoS parameters should be applied for the service.
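
The three use cases above share one pattern: a consuming NF takes an analytics output and turns it into a decision. The Python sketch below illustrates that pattern under simplifying assumptions. The load figures, experience scores, thresholds and function names are hypothetical, and real selection logic would weigh more than a single number.

```python
from typing import Dict


def least_loaded(predicted_load: Dict[str, float]) -> str:
    """Pick the instance with the lowest predicted load (0.0 to 1.0)."""
    return min(predicted_load, key=predicted_load.get)


def needs_qos_change(observed_experience: float, sla_target: float) -> bool:
    """Flag a policy change when observed service experience falls below the SLA."""
    return observed_experience < sla_target


# Hypothetical analytics outputs; real values would come from NWDAF.
slice_load = {"slice-a": 0.82, "slice-b": 0.35, "slice-c": 0.60}
smf_load = {"smf-1": 0.50, "smf-2": 0.20}

chosen_slice = least_loaded(slice_load)   # slice selection
chosen_smf = least_loaded(smf_load)       # session load balancing
```

The same `least_loaded` helper serves both slice selection and SMF selection, while `needs_qos_change` stands in for the policy check that compares observed service experience with the application SLA.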

 

Don’t Defer NWDAF Until Later

MNOs migrating networks from 4G/LTE to 5G face unprecedented changes in scale and complexity that will impact their ability to meet operating cost and performance targets employing traditional operations tools and processes. It is imperative that MNOs factor network automation and service orchestration into their plans from the outset, which will require adopting the 3GPP’s standards-based approach to 5G network data analytics.

MNOs can’t afford to defer NWDAF and 5G data analytics until “later” because without NWDAF they lose the ability to automate network operations using the mechanisms designed into the 5G Core’s SBA. The NWDAF standard can be implemented and deployed now, and as described above, it also helps facilitate the integration of multi-vendor 5G RAN/Core infrastructure. MNOs that wait to deploy 5G data analytics until after building out their 5G network risk seeing the cost of 5G operations quickly spiral out of control as a function of scale. NWDAF is a critical, non-optional 5G Core NF that operators need to build into their network from the start.

 

Management Data Analytics and RAN Analytics

The 3GPP 5G SBA also defines Management Data Analytics Functions (MDAFs) which support the collection and analysis of OAM data for a broad range of management capabilities, including automated service assurance, fault management, performance management, and provisioning. The intent is to define a standard “form factor” that will streamline the development of OAM data analytics in 5G systems, which will be particularly helpful in multi-vendor environments.

Real-time analytics will be critical for automating the operation of highly complex 5G NR networks in the baseband processing and RF domains. An example of a 5G standard in this area is the O-RAN Alliance’s specification for the RAN Intelligent Controller. While this specification has been developed outside of the 3GPP, adoption of this standard will be critical for the successful operation of multi-vendor 5G Open RAN deployments.

 


3GPP-Defined NWDAF Use Cases

The 3GPP has defined 10 specific NWDAF analytics use cases in Release 16 of the NWDAF standard:

  • Network slice instance load level computation and prediction
  • Service experience computation and prediction for an application or UE group
  • Load analytics information and prediction for a specific NF
  • Application service experience computation and prediction
  • Network performance computation and prediction
  • UE mobility analytics and expected behavior prediction
  • UE abnormal behavior/anomaly detection
  • UE communication analytics and pattern prediction
  • Congestion information – current and predicted for a specific location
  • Quality of service (QoS) sustainability – reporting and predicting QoS change

 

[author bio]

Andrew Colby is Head of 5G Strategy and Product Management at Guavus, a pioneer in AI-based analytics for communications service providers. As a member of the Guavus Office of the CTO, Andrew leads initiatives with customers to identify ways to apply analytics to improve and transform their operations and customer experience. He has worked in the areas of telecom and IP networking, operational support systems, and data analytics, for more than 30 years.

Photo by Joshua Sortino on Unsplash