
CXL 4.0 and the Interconnect Wars: How AI Memory Is Reshaping Data Center Architecture

The CXL Consortium released the CXL 4.0 specification on November 18, moving to PCIe 7.0 at 128 GT/s and introducing bundled ports. Panmnesia ships the first CXL 3.2 fabric switch, while UALink, Ultra Ethernet, and Huawei's UB-Mesh compete for adjacent niches.

December 12, 2025

December 2025 Update: The CXL Consortium released CXL 4.0 on November 18, doubling bandwidth to 128 GT/s with PCIe 7.0 and introducing bundled ports for 1.5 TB/s connections. Panmnesia began sampling the industry's first CXL 3.2 fabric switch with port-based routing. Meanwhile, UALink eyes late 2026 deployment and Huawei open-sourced UB-Mesh as an alternative.


TL;DR

CXL 4.0 represents the next generation of memory interconnect technology, enabling 100+ terabytes of pooled memory with cache coherency across AI infrastructure. The specification's bundled ports feature allows aggregating multiple physical ports into single logical attachments delivering 1.5 TB/s total bandwidth. Panmnesia's CXL 3.2 fabric switch marks the first hardware implementing port-based routing for multi-rack AI clusters. The broader interconnect landscape fragments further as UALink, Ultra Ethernet, and Huawei's UB-Mesh compete for different niches.


What Happened

The CXL Consortium released the Compute Express Link 4.0 specification on November 18, 2025, at SC25 [1]. The specification shifts from PCIe 6.x (64 GT/s) to PCIe 7.0 (128 GT/s), doubling available bandwidth while maintaining the 256-byte FLIT format introduced with CXL 3.x [2].

"The release of the CXL 4.0 specification sets a new milestone for advancing coherent memory connectivity, doubling the bandwidth over the previous generation with powerful new features," stated Derek Rohde, CXL Consortium President and Principal Engineer at NVIDIA.3

Four days earlier, on November 12, Korean startup Panmnesia announced sample availability of its PCIe 6.0/CXL 3.2 Fabric Switch: the first silicon implementing port-based routing (PBR) for CXL fabrics [4].

The interconnect landscape continues to fragment. UALink targets late-2026 data center deployment. Huawei has announced it will open-source its UB-Mesh protocol, designed to replace PCIe, CXL, NVLink, and TCP/IP with a unified standard [5].


Why It Matters for Infrastructure

Memory Becomes Composable: CXL 4.0 enables memory pooling at scale. AI inference workloads requiring hundreds of terabytes can now access shared memory pools across racks with cache coherency, not just within a single server.

Bandwidth Matches AI Demand: A CXL 4.0 bundled port built from x16 links at 128 GT/s delivers 768 GB/s in each direction, for 1.536 TB/s of total bandwidth between device and CPU [6]; the quick calculation below reproduces these figures. LLM inference serving benefits directly from this capacity.
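
The arithmetic behind those figures is worth making explicit. A minimal sketch in Python, assuming raw signaling rates (FLIT/FEC overhead ignored) and a bundle of three x16 links, which is the combination that reproduces the quoted numbers:

    # Back-of-the-envelope check of the bundled-port bandwidth quoted above.
    # Assumptions: raw signaling rate only (no FLIT/FEC overhead) and a
    # bundle of three x16 links, matching the article's figures.
    GT_PER_LANE = 128        # PCIe 7.0 transfer rate per lane (GT/s)
    LANES_PER_LINK = 16      # one x16 link
    LINKS_PER_BUNDLE = 3     # assumed bundle size

    gb_per_link = GT_PER_LANE * LANES_PER_LINK / 8    # 256 GB/s per direction
    per_direction = gb_per_link * LINKS_PER_BUNDLE    # 768 GB/s
    total = per_direction * 2                         # both directions

    print(f"{per_direction:.0f} GB/s per direction, {total / 1000:.3f} TB/s total")
    # -> 768 GB/s per direction, 1.536 TB/s total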

Multi-Rack AI Clusters: Port-based routing in CXL 3.2/4.0 lets fabric switches interconnect thousands of devices across multiple racks without incurring full network-stack latency. Panmnesia claims "double-digit nanosecond latency" for memory access [7]; the sketch below illustrates the per-hop routing model.
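
To see why the per-hop cost is small, consider a minimal sketch of port-based routing: each fabric endpoint gets a PBR ID (12 bits in CXL 3.x, so up to 4,096 ports), and each switch forwards on a single table lookup. The class and table contents below are illustrative, not Panmnesia's implementation:

    # Illustrative port-based routing (PBR) switch: one table lookup per hop.
    # The 12-bit PBR ID space comes from CXL 3.x; the rest is hypothetical.
    MAX_PBR_ID = 4096  # 2**12 addressable fabric ports

    class FabricSwitch:
        def __init__(self, name: str):
            self.name = name
            self.routes: dict[int, int] = {}  # dest PBR ID -> egress port

        def add_route(self, dest_pbr_id: int, egress_port: int) -> None:
            if not 0 <= dest_pbr_id < MAX_PBR_ID:
                raise ValueError("PBR ID outside 12-bit range")
            self.routes[dest_pbr_id] = egress_port

        def forward(self, dest_pbr_id: int) -> int:
            # A flat table lookup, not a network-stack traversal, is what
            # keeps per-hop latency in the nanosecond range.
            return self.routes[dest_pbr_id]

    leaf = FabricSwitch("rack1-leaf")
    leaf.add_route(0x042, egress_port=7)  # memory pool in rack 2 (hypothetical)
    print(leaf.forward(0x042))            # -> 7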

Standards Fragmentation Risk: Four competing interconnect ecosystems (CXL/PCIe, UALink, Ultra Ethernet, NVLink) force infrastructure planners to bet on winners. Equipment purchased today may face interoperability challenges in 2027.


Technical Details

CXL 4.0 Specification

Feature | CXL 3.x | CXL 4.0
Base Protocol | PCIe 6.x | PCIe 7.0
Transfer Speed | 64 GT/s | 128 GT/s
FLIT Size | 256 B | 256 B
Retimers Supported | 2 | 4
Link Width Options | Standard | Native x2 added
Bundled Ports | No | Yes

Bundled Ports Architecture

CXL 4.0's bundled ports aggregate multiple physical CXL device ports into a single logical entity [8], as the sketch after this list illustrates:

  • Host and Type 1/2 device can combine multiple physical ports
  • System software sees single device despite multiple physical connections
  • Optimized for 256B Flit Mode, eliminating legacy 68B Flit overhead
  • Enables 1.5+ TB/s total bandwidth per logical connection
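
A minimal model of that aggregation, with hypothetical class and field names (the spec defines the mechanism, not this API):

    # Sketch of a bundled port: several physical links, one logical attachment.
    from dataclasses import dataclass, field

    @dataclass
    class PhysicalPort:
        port_id: int
        gb_per_direction: float  # e.g. 256.0 for an x16 link at 128 GT/s

    @dataclass
    class BundledPort:
        logical_id: int
        members: list[PhysicalPort] = field(default_factory=list)

        @property
        def gb_per_direction(self) -> float:
            # System software sees one device; bandwidth is the member sum.
            return sum(p.gb_per_direction for p in self.members)

    bundle = BundledPort(0, [PhysicalPort(i, 256.0) for i in range(3)])
    print(bundle.gb_per_direction)  # -> 768.0 GB/s per direction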

Panmnesia CXL 3.2 Fabric Switch

The first CXL 3.2 switch silicon includes [9]:

Specification | Detail
Protocol Support | PCIe Gen 6.0 + CXL 3.2 hybrid
Data Rate | 64 GT/s
Routing Modes | PBR (port-based) and HBR (hierarchy-based)
CXL Subprotocols | CXL.cache, CXL.mem, CXL.io
Lane Count | 256-lane high fan-out
Latency | Double-digit nanoseconds
Backward Compatibility | All previous PCIe/CXL generations

Target applications include DLRM (Deep Learning Recommendation Models), LLM inference, RAG workloads, and MPI-based HPC simulations.

Competing Interconnect Standards

Standard | Owner | Purpose | Bandwidth | Scale | Timeline
CXL 4.0 | CXL Consortium | Memory coherency | 128 GT/s | Multi-rack | Late 2026-2027
NVLink 5 | NVIDIA | GPU-GPU | 1.8 TB/s | 576 GPUs | Available
UALink 1.0 | AMD-led consortium | Accelerator-accelerator | 200 Gb/s/lane | 1,024 devices | Late 2026
Ultra Ethernet | UEC | Scale-out networking | Ethernet-based | 10,000s of endpoints | 2026+
UB-Mesh | Huawei | Unified interconnect | 1+ TB/s/device | 1M processors | Open-sourced

Interconnect Decision Framework

When to use which standard (a lookup sketch in code follows the table):

Use Case | Best Fit | Why
GPU-to-GPU within node | NVLink | Highest bandwidth (1.8 TB/s), lowest latency
GPU-to-GPU across nodes | UALink | Open-standard alternative to NVLink
Memory expansion | CXL | Cache coherency with CPU, memory pooling
Scale-out networking | Ultra Ethernet / InfiniBand | Designed for 10,000+ endpoint clusters
Unified China ecosystem | UB-Mesh | Avoids Western IP restrictions
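
For planning scripts, the table above collapses into a simple lookup. A minimal sketch; the keys and phrasing mirror the table rows and are illustrative only:

    # The decision table as a lookup helper; guidance, not a product matrix.
    INTERCONNECT_FIT = {
        "gpu_to_gpu_within_node": ("NVLink", "highest bandwidth, lowest latency"),
        "gpu_to_gpu_across_nodes": ("UALink", "open-standard alternative to NVLink"),
        "memory_expansion": ("CXL", "CPU cache coherency and memory pooling"),
        "scale_out_networking": ("Ultra Ethernet / InfiniBand", "10,000+ endpoints"),
        "unified_china_ecosystem": ("UB-Mesh", "avoids Western IP restrictions"),
    }

    def recommend(use_case: str) -> str:
        standard, why = INTERCONNECT_FIT[use_case]
        return f"{standard} ({why})"

    print(recommend("memory_expansion"))
    # -> CXL (CPU cache coherency and memory pooling)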

UALink does not compete directly with CXL; they serve different purposes [10]:

  • UALink: GPU-to-GPU scaling for accelerator clusters (scale-up)
  • CXL: CPU-memory coherency and memory pooling (memory expansion)
  • Ultra Ethernet: Scale-out networking across data centers

"UALink works alongside PCIe and CXL, but only UALink has the effect of unifying the allocated resources. UALink is designed to connect your main GPU units for GPU-to-GPU scaling," explained Michael Posner, VP of Product Management at Synopsys.11

Huawei UB-Mesh

Huawei's alternative approach aims to replace all existing interconnects [12]; a rough latency sketch follows the list:

  • Targets 1 TB/s+ bandwidth per device
  • ~150 ns per-hop latency (an improvement from microseconds to nanoseconds)
  • Synchronous load/store semantics vs. packet-based
  • Open-source license announced September 2025
  • Scales to 1 million processors in "SuperNode" architecture
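
The per-hop figure translates into simple latency budgets. A rough sketch using the ~150 ns number above; the hop counts are hypothetical:

    # Latency budget for multi-hop paths at UB-Mesh's claimed ~150 ns per hop.
    HOP_LATENCY_NS = 150

    for hops in (1, 4, 8):
        total_ns = hops * HOP_LATENCY_NS
        print(f"{hops} hop(s): {total_ns} ns ({total_ns / 1000:.2f} us)")
    # Even an 8-hop path stays near 1.2 us, versus the multi-microsecond
    # round trips typical of packet-based network stacks.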

Industry adoption remains uncertain given geopolitical concerns and existing standards momentum.


What's Next

Late 2026: UALink switches reach data centers; CXL 4.0 products begin sampling.

Late 2026-2027: CXL 4.0 multi-rack systems reach production deployment [13].

Q4 2026: Upscale AI targets UALink switch delivery [14].

Ongoing: Standards bodies navigate coexistence of CXL, UALink, and Ultra Ethernet. Huawei's UB-Mesh seeks adoption outside Western markets.

The interconnect landscape will remain fragmented through at least 2027. No single standard addresses all use cases: memory pooling (CXL), accelerator scaling (UALink/NVLink), and network fabric (Ultra Ethernet/InfiniBand).


Key Takeaways

For infrastructure planners:

  • CXL 4.0 enables 100+ TB memory pools with cache coherency across racks
  • Panmnesia is sampling the first CXL 3.2 fabric switch with port-based routing
  • Plan for standards coexistence: CXL + UALink + Ultra Ethernet/InfiniBand
  • Expect a late 2026-2027 deployment timeline for CXL 4.0 production systems

For operations teams:

  • CXL maintains backward compatibility with previous generations
  • Port-based routing simplifies multi-rack fabric management
  • Double-digit nanosecond latency for memory access across switches
  • Monitor Panmnesia, XConn, and other CXL switch vendors for availability

For strategic planning:

  • No single interconnect standard will "win" because different layers serve different purposes
  • Memory pooling becomes viable for AI inference at scale
  • Huawei's UB-Mesh creates a parallel ecosystem primarily for the China market
  • Equipment decisions in 2025-2026 will affect interoperability through 2030


For AI infrastructure deployment with advanced interconnect architectures, contact Introl.


References


  1. CXL Consortium. "CXL Consortium Releases the Compute Express Link 4.0 Specification." November 18, 2025. 

  2. VideoCardz. "CXL 4.0 spec moves to PCIe 7.0, doubles bandwidth over CXL 3.0." November 2025. 

  3. Business Wire. "CXL Consortium Releases the Compute Express Link 4.0 Specification Increasing Speed and Bandwidth." November 18, 2025. 

  4. Business Wire. "Panmnesia Announces Sample Availability of PCIe 6.0/CXL 3.2 Fabric Switch." November 12, 2025. 

  5. Tom's Hardware. "Huawei to open-source its UB-Mesh data center-scale interconnect soon." August 2025. 

  6. Datacenter.news. "CXL 4.0 doubles bandwidth, introduces bundled ports for data centres." November 2025. 

  7. Panmnesia. "Press Release: PCIe 6.0/CXL 3.2 Fabric Switch." November 2025. 

  8. Blocks and Files. "CXL 4.0 doubles bandwidth and stretches memory pooling to multi-rack setups." November 24, 2025. 

  9. TechPowerUp. "Panmnesia Samples Industry's First PCIe 6.0/CXL 3.2 Fabric Switch." November 2025. 

  10. Semi Engineering. "New Data Center Protocols Tackle AI." 2025. 

  11. Synopsys. "Ultra Ethernet UaLink AI Networks." 2025. 

  12. ServeTheHome. "Huawei Presents UB-Mesh Interconnect for Large AI SuperNodes at Hot Chips 2025." August 2025. 

  13. Blocks and Files. "CXL 4.0 doubles bandwidth." November 2025. 

  14. HPCwire. "Upscale AI Eyes Late 2026 for Scale-Up UALink Switch." December 2, 2025. 

  15. EE Times. "CXL Adds Port Bundling to Quench AI Thirst." November 2025. 

  16. SDxCentral. "Compute Express Link Consortium debuts 4.0 spec to push past bandwidth bottlenecks." November 2025. 

  17. CXL Consortium. "CXL 4.0 White Paper." November 2025. 
