Authors:
Sana Tariq, Principal Technology Architect, Cloud, TELUS Communications
Kandan Kathirvel, Group Product Manager, Telecom, Google Cloud
Abdel Ibrahim, Principal Architect, Telecom, Google Cloud
A snapshot of past and present:
Communication service providers (CSPs) have been using virtual networks and private cloud-based deployments for nearly half a decade. Pivoting from physical network functions (PNFs) to virtual network functions (VNFs) was the first phase in this journey to cloud adoption. The transformation to cloud promised to improve operational efficiency by reducing reliance on specialized hardware, reducing costs through improved hardware utilization, and increasing agility through more flexible software delivery.
This first phase of CSPs’ private cloud adoption neither improved operational efficiency nor sped up the deployment of new telecom services. The VNFs required a plethora of custom configurations and network interfaces, making them far more complex than IT-based cloud applications. CSPs learned to customize private clouds to support the VNFs, but most vendors simply repackaged PNFs as VNFs, with the services remaining tightly coupled to the hardware. This squandered the promised virtualization savings, since the tight coupling prevented efficient hardware utilization. VNFs continued to use legacy principles, such as active-standby redundancy, rather than cloud native redundancy models, further contributing to low utilization and limited cost advantage. Furthermore, the complex configurations slowed deployment and overshadowed the improvements in delivery methods, so the promised agility benefits never materialized.
The second phase of CSPs’ shift to the cloud came with the birth of container network functions (CNFs), which package network functions in software containers rather than virtual machines. Container microservices allow multiple services to run independently on the same cluster. The benefit of CNFs is directly tied to the specific implementations used; only a few CSPs achieved improvements to cost, performance, and agility. The industry cannot afford to repeat the antipatterns that were discovered during the first phase. As the industry shifts from VNFs to CNFs, we need to break the tight coupling of software and hardware, and also modernize the network software, incorporating key concepts like multi-tenancy and single PaaS/CaaS methodologies. Unless we accomplish both of these objectives, CNFs cannot fulfill the CSPs’ increasingly ambitious expectations.
Cloud native development requires a different philosophy, where hardware is clearly separate from the workloads. This approach is critical to realizing the promise of cloud transformation. Workloads need to express their needs, and the cloud needs to meet those needs. This separation enables workloads to be relocated dynamically and increases their efficiency. It supports the “fail-fast and recover faster” redundancy model, tight bin-packing of workloads for optimal hardware utilization, and horizontal scaling. To achieve these benefits, CSPs need to both containerize and modernize network functions. They must embrace the principle of cattle, not pets: instances are interchangeable and replaced when they fail, rather than individually maintained. A greater shift in culture and mindset is required to accelerate the adoption of agile DevOps methodologies that support the cloud native approach.
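To make the idea of workloads declaring their needs and the cloud packing them tightly more concrete, here is a minimal sketch of first-fit decreasing placement. It is an illustration only, not how Kubernetes or any particular scheduler works: the `Workload` and `Node` types, the single CPU dimension, and the names `cnf-1` through `cnf-4` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    name: str
    cpu: int  # requested CPU cores (hypothetical, single resource dimension)


@dataclass
class Node:
    name: str
    capacity: int  # total CPU cores available on the node
    placed: list = field(default_factory=list)

    @property
    def free(self) -> int:
        # Remaining capacity after subtracting everything already placed.
        return self.capacity - sum(w.cpu for w in self.placed)


def place(workloads, nodes):
    """First-fit decreasing: largest requests first, each on the first node with room.

    Returns the workloads that could not be placed anywhere.
    """
    unplaced = []
    for w in sorted(workloads, key=lambda w: w.cpu, reverse=True):
        target = next((n for n in nodes if n.free >= w.cpu), None)
        if target is None:
            unplaced.append(w)
        else:
            target.placed.append(w)
    return unplaced


# Hypothetical example: four workloads declare their needs, two nodes meet them.
nodes = [Node("node-a", 8), Node("node-b", 8)]
workloads = [
    Workload("cnf-1", 5),
    Workload("cnf-2", 4),
    Workload("cnf-3", 3),
    Workload("cnf-4", 2),
]
unplaced = place(workloads, nodes)

for n in nodes:
    print(n.name, [w.name for w in n.placed], "free:", n.free)
```

Because each workload states only what it needs, and not which machine it wants, the same placement logic can be rerun after a node failure to relocate its workloads elsewhere.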
Diving deeper into the cloud native approach
Cloud native practices involve not just technology, but also a principled vision and a different mindset based on computer science fundamentals like modularity, abstraction, separation of concerns, resource sharing, and loose coupling. Some attributes and best practices must be governed at the industry level to enable agility in network function and automation. As the Cloud Native Computing Foundation’s (CNCF) charter says, “Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds”—but adoption of these cloud native principles can be accelerated only when CSPs learn about and ask for them.
The industry increasingly realizes that openness drives innovation and agility. For CSPs to adopt CNFs, cloud native best practices must evolve further and be adopted more widely. In addition, the industry needs simple, intent-based cloud native automation to deploy and operate multi-vendor CNFs on multi-vendor clouds. Nephio aims to move the industry ecosystem toward open, simplified, interoperable, Kubernetes-based cloud native automation. That is only possible through collaboration across all three industry verticals:
- Network function vendors
- Communications service providers
- Hyperscalers
Network function vendors
As stated earlier, simply wrapping a PNF’s software in a container greatly limits the value of CNFs. To truly unlock this value, we need to consider twelve additional best practices. As outlined in references [1] and [2], these cloud native principles are well established in enterprise workloads. CNFs need to adopt these principles as well. CNFs need to be:
Compatible: A cloud native approach allows applications and workloads to run anywhere. Ideally, CNFs should work with any certified Kubernetes product, even if we need to use container network interface (CNI) plug-ins or other extensions. CNFs should work on any CNI-compatible network that meets their functionality requirements. Today, network interfaces and CNI plug-ins can create hardware dependencies that tie CNFs to specific infrastructure. A wise, if ambitious, approach would be to implement all networking and I/O acceleration purely in software. Broader adoption of this software separation might take years to mature. In the short term, we can improve the automation to maximize the agility of the applications.
Stateless: Cloud native applications need the resiliency to fail quickly and recover elsewhere in the cloud. The legacy approach uses local storage, which makes network functions heavy and slow to relocate. To improve workload mobility, we need to store the state in custom resource definitions (CRDs) or a separate database.
Secure: At its most basic, this principle says “CNFs must run unprivileged”; in other words, the CNFs must be deployable on modern cloud platforms without the need for root administrative privileges. In the cloud-native approach, multiple CNFs must be able to run concurrently on the same hardware. During the transition from PNFs to VNFs, applications required root-level access to Linux, as if they were running on dedicated servers. The same trend continued in the transition to CNFs. We need to remove this dependency on root access by redeveloping the application code to use more modern, cloud-native security protocols.
Scalable: Cloud native telco applications need to support horizontal scaling (across multiple machines) and vertical scaling (between different machine sizes). We want to optimize cost and performance by starting with a very small application and growing it as needed. This model yields efficiency and agility.
Configurable: Open configuration offers telcos the control and freedom of DevOps tools to create and manage services. This should happen via custom resource definitions (CRDs) and operators, or other declarative interfaces.
Observable: Cloud native applications need an OpenMetrics interface that Prometheus and other monitoring tools can use. Kubernetes needs to access performance metrics that support container-level resiliency features. Analytics applications can process these metrics and suggest configuration changes to improve utilization and performance.
Portable: Applications that can run on multiple clouds are more agile. CNFs must be able to declare their platform requirements without implying a specific implementation. The cloud fulfills those requirements, making network functions oblivious to the underlying cloud offering. This maximizes portability between execution environments.
Installable and Upgradeable: Use of CRDs, operators, and declarative configurations gives flexibility and ease in the deployment and upgrade of CNFs. Automation tooling can track and validate the installation and upgrade processes, with rollbacks supported if needed.
Parity across environments: Cloud native applications need to minimize divergence between development and production, enabling continuous deployment for maximum agility. Telcos can deliver features and upgrades faster by implementing DevOps practices over a CI/CD pipeline.
Open: Cloud native applications need to be orchestrated, run as a service, and expose themselves via RESTful interfaces. Nephio will achieve this goal by enabling third-party automation tools to reconfigure the application and achieve any type of orchestration use case.
Traceable: Cloud native applications need to support real-time troubleshooting through Open APIs with telemetry-compatible tracing.
Loggable: Cloud native applications need to support uniform logging for consistency and access to network-wide logs.
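Several of the attributes above (Configurable, Scalable, Installable and Upgradeable) rest on the same declarative pattern: the operator states the desired outcome, and automation works out the steps. The sketch below illustrates that pattern with plain Python dictionaries standing in for custom resources. It is not the Nephio or Kubernetes API; the `reconcile` function, the `version` and `replicas` fields, and the names `cnf-a`, `cnf-b`, and `cnf-c` are assumptions made for illustration.

```python
def reconcile(desired: dict, observed: dict) -> list:
    """Compare declared (desired) state with observed state and return the
    actions needed to converge. Pure function: it changes nothing itself."""
    actions = []
    for name, spec in sorted(desired.items()):
        current = observed.get(name)
        if current is None:
            # Declared but not running: install it at the declared version and size.
            actions.append(("install", name, spec["version"], spec["replicas"]))
            continue
        if current["version"] != spec["version"]:
            actions.append(("upgrade", name, spec["version"]))
        if current["replicas"] != spec["replicas"]:
            # Horizontal scaling: only the replica count changes.
            actions.append(("scale", name, spec["replicas"]))
    # Running but no longer declared: remove it.
    for name in sorted(set(observed) - set(desired)):
        actions.append(("remove", name))
    return actions


# Hypothetical declared and observed states.
desired = {
    "cnf-a": {"version": "1.2", "replicas": 3},
    "cnf-b": {"version": "2.0", "replicas": 1},
}
observed = {
    "cnf-a": {"version": "1.1", "replicas": 2},
    "cnf-c": {"version": "0.9", "replicas": 1},
}

actions = reconcile(desired, observed)
for action in actions:
    print(action)
```

Once observed state matches desired state, the function returns an empty list, so running it repeatedly is harmless. A rollback in this model is simply a matter of declaring the previous version again and letting the same loop converge.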
Communication service providers (CSPs)
CSPs and their customers expect the highest-quality voice and data services. Adopting agile and cloud native methodologies, and maximizing their benefits, requires a shift in culture and perspective at all levels of the industry.
Hyperscalers
The cloud offers comprehensive services with resiliency, geographic diversity, greater agility, advanced analytics, and AI/ML-based intelligence. CSPs have started using the cloud to run their networks, reducing cost and increasing efficiency. Hyperscalers bring the promise of greater agility and scale to meet CSPs’ networking requirements. Adopting these cloud native principles across cloud, network function, and telco services will unlock the efficiency and cost savings promised by cloud transformation.
Unlocking the cloud native network together
This post describes many reasons for telcos to go cloud native. This approach gives telcos greater control over the distribution of network resources, more efficient and powerful network utilization, next-generation customer services through an expanded network edge, and an industry-wide transformation that will usher in a new golden age of network technology.
Collectively, we can solve the challenge of onboarding telco network functions quickly in the Kubernetes world.
Find out more about Nephio and join us now. We welcome your participation!
References
[1] The Twelve-Factor App, https://12factor.net/
[2] Twelve factors in CNFs, https://x.cnf.dev/