Sweta Bhardwaj | Dell Technologies Info Hub

Redefining Telecom with Dell AI: The Future Starts with Dell AI for Telecom

Sweta Bhardwaj John Rampart Pratik Sarkar

Wed, 02 Oct 2024 10:52:20 -0000


We are entering a transformational phase with AI and Gen AI. In a recent survey on the state of AI, 65% of organizations confirmed “regularly using Gen AI,” with over 50% of those organizations using Gen AI in at least two business functions.[1] It is estimated that by 2025, two-thirds of businesses will leverage a combination of generative AI and Retrieval Augmented Generation (RAG) to power domain-specific self-service knowledge discovery, improving decision efficacy by 50%.[2] But this transformation is fraught with challenges, such as skills gaps, data and model security, governance, and the cost of running large language models (LLMs). It is also clear that quality data is a differentiator for Gen AI effectiveness. Any AI model is only as good as the data that feeds and trains it. Critical functions include data prep and formatting, mass storage, protection, accessibility, bandwidth, and recovery.

There’s a need to rethink everything as the current data center and IT operating model are not equipped to cope with the speed and scale of AI, nor are they built to capture the full potential of AI. What’s needed is a new type of data center that is purpose-built to meet the specific demands of AI – this is the AI factory.

Just as physical factories fueled the industrial revolution, AI factories will power the AI revolution. Instead of physical goods, the AI factory produces actionable intelligence, fresh content, and new insights. The AI factory is the new paradigm, and infrastructure is the foundation that makes AI possible.

At Dell Technologies World 2024, we announced the Dell AI Factory, which represents Dell’s approach for embracing and implementing AI. This approach entails helping customers accelerate their AI initiatives and addressing their most pressing challenges with a collection of AI technologies, an open ecosystem of partners, validated and integrated solutions, and expert services to achieve AI outcomes faster.

The need for data security, cost savings, and compliance with government and local sovereignty regulations presents a significant opportunity for Telecom providers and their partnerships with enterprise businesses. With more than 50% of enterprise data generated at the edge and huge amounts of network data aggregated in Telco data centers, much of the data that fuels AI and Gen AI outcomes resides on-premises, highlighting the growing importance of secure, localized solutions.

We are launching Dell AI for Telecom: a family of Dell PowerEdge GPU-enabled servers, data management/storage platforms, and networking to address this transformational evolution together with our ecosystem partners.

Dell AI Factory

With AI-optimized infrastructure as the foundation and PowerEdge servers equipped with GPUs from Nvidia, the Dell AI Factory is open and modular, allowing customers to right-size the infrastructure and expand it as their business needs grow. The Dell AI Factory is comprehensive: it includes Dell PowerEdge GPU servers, network switching fabrics, storage devices, the Nvidia AI Enterprise software suite, and Dell professional services for a complete AI Factory infrastructure package.

As the processing of AI data moves closer to the data source, Dell has developed an AI Factory portfolio for every use case. Nvidia L4, L40S, H100, and H200 GPUs, and soon the Blackwell family of GPUs, are available in the Dell AI Factory so that each use case can be matched with the right GPU. Complementing the GPU offering, the Dell AI Factory is based on our PowerEdge server infrastructure, which ranges from the PowerEdge XE9680 with H100 and H200 GPUs for larger model training, inferencing, and fine-tuning environments to the PowerEdge R760xa with L40S GPUs for mid-sized model training, inferencing, fine-tuning, and video processing applications. As we move closer to the far edge, Dell complements the AI Factory offering with our AI Optimized PowerEdge XR8000 Configurations with Nvidia, suitable for the network-centric use cases that are critical for telecom operators.

The XR8000 server platform is a ruggedized, short-depth, modular, sled-based platform that supports Nvidia L4 GPUs. It is ideal for remote zero-touch edge locations where the data resides, or for latency-sensitive applications.

Networking and Storage

In the era of Gen AI, the role of networking and storage infrastructure is more critical than ever. Network switching fabrics are an integral component of an AI Factory because extremely low latency, congestion control, and minimal hops are switching fabric attributes required to deliver the best AI/Gen AI experience. There are three main switching fabrics in an AI Factory:

The GPU-to-GPU fabric that connects all GPUs and allows them to communicate during training, inferencing, and tuning of LLMs.
The storage switching fabric that connects high-performance, scalable, and low-latency storage devices to the GPU server platform. Storage is important for a process called “checkpointing” during LLM training. Checkpointing involves storing the current state of an LLM during training, which allows the training process to recover from an interruption and resume from the last saved state rather than starting over.
The management fabric that includes both In-band (IB) and Out-of-band (OOB) communications.
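
The checkpointing process described in the storage fabric item above can be illustrated with a minimal sketch. It is not taken from the Dell AI Factory or any Nvidia software: the function names (`save_checkpoint`, `load_checkpoint`, `train`), the JSON file format, and the toy "weight" update are all assumptions chosen to keep the example self-contained. A real LLM training job would save model and optimizer state through its training framework, typically to shared high-performance storage.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the training state atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the new file in as a single step.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0, "weight": 0.0}
    with open(path) as f:
        return json.load(f)


def train(path, total_steps, checkpoint_every, fail_at=None):
    """Toy training loop (hypothetical): resumes from the checkpoint at `path`."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated interruption")
        state["weight"] += 0.5  # stand-in for a real optimizer update
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        ckpt = os.path.join(workdir, "ckpt.json")
        try:
            train(ckpt, total_steps=10, checkpoint_every=3, fail_at=7)
        except RuntimeError:
            print("interrupted; last checkpoint:", load_checkpoint(ckpt))
        print("after resume:", train(ckpt, total_steps=10, checkpoint_every=3))
```

The sketch shows the trade-off that makes storage performance matter: work done since the last checkpoint is lost on an interruption, so frequent checkpoints reduce lost work but increase the write load on the storage fabric.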

Historically, InfiniBand has been the dominant networking fabric in large-scale AI clusters, particularly because of its high data transfer rates, congestion control, and ultra-low latency. These attributes make it ideal for high-performance computing (HPC) and GenAI applications.

However, Ethernet is starting to gain traction in these high-performance environments, especially with the efforts of the Ultra Ethernet Consortium (UEC), a group focused on developing an Ethernet stack tailored for AI and HPC workloads. Dell Technologies is a member of UEC, contributing to innovations that will advance AI infrastructures.

In terms of storage, the Dell PowerScale storage platform is certified for optimal performance in Nvidia GPU-based AI Factories. It scales from a few terabytes to many petabytes and is the primary storage platform used in the Dell AI Factory.

Dell Validated Designs

The Dell AI Factory portfolio is based on Dell Validated Designs (DVDs), which simplify AI with engineering-validated solutions that accelerate the deployment of AI models. With a DVD, customers can spend precious engineering resources on developing AI models and moving them into production rather than on infrastructure, deployment, monitoring, and system troubleshooting.

AI at the Edge

Artificial Intelligence at the edge is driving a new wave of operational efficiency and innovation for telcos and enterprises. The Dell XR8000, powered by Intel Sapphire Rapids processors and Nvidia L4 GPUs, delivers robust capabilities for edge AI deployments in private networks and enterprises. As part of Dell’s broader AI ecosystem, the XR8000 enables seamless integration into AI factories, whether for large-scale management clusters or smaller, focused deployments. With its ruggedized design and flexibility, the XR8000 is positioned to enhance operations, generate new revenue streams, and provide valuable services in edge computing environments.

Transforming Telecommunications with Dell AI Factory with Nvidia

Use cases are the driving force and often one of the main motivations behind many of the technology investments that an enterprise makes. AI and GenAI, including AI Factories, have unlocked many use cases spanning both network (NW) operators and their end customers. AI and GenAI improve NW performance, raise the productivity of NW operations, and help NW operators meet revenue targets. They also improve the end subscriber experience by delivering personalized customer experiences. In the sections below, we discuss use cases targeting telecom network operations and efficiency, including Network Engineer CoPilot, Network Digital Twins, and enhanced customer experience.

Network Engineer CoPilot with Kinetica

Dell Technologies is partnering with Kinetica to deliver the Network Engineer CoPilot use case, leveraging the power of GenAI to transform troubleshooting in complex 5G Core and RAN environments. These networks produce vast amounts of data, making it difficult for engineers to quickly pinpoint the root causes of failures. Manual troubleshooting is often time-consuming, error-prone, and too slow for real-time network operations.

With Network Engineer CoPilot, fault management is automated using telco-specific large language models (LLMs) that analyze data in real time. The system detects anomalies, predicts potential failures, and recommends corrective actions, enabling a much faster and more accurate automated Root Cause Analysis (RCA) process. This dramatically reduces troubleshooting time, so network issues are resolved efficiently and with minimal downtime.

In addition to reducing troubleshooting time, proactive AI interventions help improve call quality by detecting and resolving issues before they impact users. This results in enhanced customer satisfaction, better network performance, and improved operational efficiency. Dell Technologies supports this solution with its high-performance PowerEdge R760xa, XR8000, and XE9680 GPU-enabled servers, which are specifically designed to handle the demanding computational needs of real-time data analysis in 5G environments.

Network Digital Twins with Synthefy: Predictive Planning and Efficiency

Dell Technologies is partnering with Synthefy to advance the Network Digital Twin use case. Telecom networks generate massive amounts of data from switches, routers, and access points, making real-time analysis complex. Generative AI (GenAI) addresses this challenge by creating Network Digital Twins—virtual models that replicate both physical and software-defined components of the network. These digital twins enable telecom operators to simulate network conditions, test upgrades, and predict failures using realistic synthetic data, all without disrupting live operations. This results in more efficient decision-making and improved network optimization.

Synthefy provides the sophisticated AI Engine software for the digital twins, while Dell’s advanced GPU-enabled servers, such as the XE9680, and scalable storage solutions like Dell PowerScale, offer the robust infrastructure required to support these AI-driven simulations. Dell’s infrastructure excels in handling large and diverse datasets, including the time series, logs, and PCAP files that are essential for high-performance simulations. By integrating Synthefy software with Dell’s infrastructure, we simplify the creation of digital twins, reduce operational costs, and enhance network resilience and adaptability to future demands.

Enhanced CSP Operations and Customer Engagement with Amdocs

By combining the Amdocs amAIz GenAI suite with Dell Technologies’ advanced infrastructure and Nvidia GPUs, we are targeting next-generation customer service and CSP operations. Amdocs’ amAIz Copilots integrate with telecom systems to deliver industry-specific AI assistants that provide contextual, accurate, and trustworthy responses to customer queries, while also automating complex tasks. This collaboration promises to:

Improve customer service with intelligent AI-driven Copilots
Enhance efficiency through automation of complex tasks
Streamline operations and reduce operational costs

Dell’s PowerEdge XE9680 and R760xa servers, equipped with Nvidia GPUs, offer the necessary computational power for training and deploying large language models. Dell PowerScale storage provides high-speed access to essential datasets, while Dell Networking solutions provide low-latency data transfers. Together, these technologies enable telecom operators to deploy scalable, high-performance GenAI solutions, transforming customer interactions and optimizing service delivery.

Summary and Conclusion:

At Dell Technologies, we are committed to advancing AI and Generative AI with our Telecom partners. Our AI Factory delivers scalable solutions with PowerEdge GPU servers, network switching, and high-performance storage, tailored specifically for efficient, localized AI deployment.

If optimizing your AI infrastructure and enhancing data security are priorities for your organization, consider Dell's AI Factory for your next project. The potential for innovation and efficiency is vast, and our solutions are designed to make it accessible. Get started with Dell Technologies today and explore how our Dell AI Factory can support your journey into the next era of AI-driven telecommunications. To learn more or to begin your integration, contact your Dell account representative or join our partner program.

References:

[1] https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai

[2] IDC FutureScape: Worldwide Artificial Intelligence and Automation 2024 Predictions, AP50341323
