CXL 4.0 Infrastructure Planning Guide: Memory Pooling for AI at Scale

Complete CXL 4.0 deployment guide covering bundled ports, multi-rack memory pooling, KV cache offloading, vendor ecosystem, and 2026-2027 planning timeline.


December 13, 2025

December 2025 Update: The CXL Consortium released CXL 4.0 on November 18, 2025, doubling bandwidth to 128 GT/s via PCIe 7.0 and introducing bundled ports for 1.5 TB/s connections. This guide covers deployment planning for organizations preparing to implement CXL-based memory pooling in their AI infrastructure.


TL;DR

CXL 4.0 enables memory pooling at unprecedented scale, allowing AI inference workloads to access 100+ terabytes of shared memory with cache coherency across multiple racks. The specification's bundled ports aggregate multiple physical connections into single logical attachments delivering 1.5 TB/s bandwidth. For infrastructure planners, the key decisions involve understanding when to adopt CXL (2026-2027 for production), which products to evaluate now (CXL 2.0/3.0 switches shipping), and how CXL complements rather than replaces NVLink and UALink. This guide provides the technical depth and decision frameworks needed to plan CXL deployments.


The Memory Wall Problem

Large language models hit a fundamental constraint: GPU memory capacity. Modern AI inference workloads routinely exceed 80-120 GB per GPU, and the key-value (KV) cache grows with context length.1 A single inference request with a 128K context window can consume tens of gigabytes just for KV cache storage.

The problem intensifies at scale. Model weights for frontier LLMs consume hundreds of gigabytes. KV cache requirements grow linearly with both batch size and sequence length. GPU VRAM remains fixed at 80GB (H100) or 192GB (B200).2
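Back-of-envelope math makes the wall concrete. The sketch below estimates KV cache size from model shape; the 70B-class dimensions (80 layers, 8 KV heads under grouped-query attention, head dimension 128, fp16) are illustrative assumptions, not any vendor's published configuration:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    """KV cache size: 2 tensors (K and V) per layer, per token, per sequence."""
    return 2 * n_layers * n_kv_heads * head_dim * dtype_bytes * seq_len * batch

# Illustrative 70B-class config with GQA: 80 layers, 8 KV heads, head_dim 128, fp16
size = kv_cache_bytes(80, 8, 128, seq_len=128 * 1024, batch=1)
print(f"{size / 2**30:.0f} GiB per 128K-context request")  # 40 GiB
```

At larger batch sizes the same arithmetic quickly reaches hundreds of gigabytes, which is why the cache has to live somewhere other than VRAM.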

Traditional solutions fall short:

| Approach | Limitation |
|---|---|
| Add more GPUs | Linear cost increase; memory still isolated per GPU |
| NVMe offloading | ~100 μs latency, roughly 1,000x slower than DRAM |
| RDMA-based sharing | Still 10-20 μs latency; complex networking |
| Larger GPU memory | Supply-constrained, expensive |

CXL changes this equation by enabling memory pooling with DRAM-like latency (200-500 ns) across the data center.3


CXL 4.0 Technical Deep Dive

Evolution from CXL 1.0 to 4.0

CXL has matured rapidly since its 2019 introduction. Each generation expanded capabilities:

| Generation | Release | PCIe Base | Speed | Key Advancement |
|---|---|---|---|---|
| CXL 1.0/1.1 | 2019 | PCIe 5.0 | 32 GT/s | Basic coherent memory attach |
| CXL 2.0 | 2020 | PCIe 5.0 | 32 GT/s | Switching, memory pooling, multi-device |
| CXL 3.0/3.1 | 2022/2023 | PCIe 6.0 | 64 GT/s | Fabric support, peer-to-peer, 4,096 nodes |
| CXL 4.0 | Nov 2025 | PCIe 7.0 | 128 GT/s | Bundled ports, multi-rack, enhanced RAS |

CXL 2.0 introduced the foundational concept of memory pooling. Multiple Type 3 memory devices connect to a switch, forming a shared pool from which the switch dynamically allocates resources to different hosts.4 This enables memory utilization improvements from typical 50-60% to 85%+ across a cluster.

CXL 3.0 added fabric capabilities supporting multi-level switching and up to 4,096 nodes with port-based routing (PBR).5 The shift to 256-byte FLITs and PCIe 6.0's 64 GT/s doubled available bandwidth.

CXL 4.0 doubles bandwidth again while introducing features critical for multi-rack AI deployments.

Bundled Ports Architecture

CXL 4.0's most significant feature for high-performance computing: bundled ports aggregate multiple physical CXL device ports into a single logical entity.6

How bundled ports work:

  1. A host and Type 1/2 device combine multiple physical ports
  2. System software sees a single device despite multiple physical connections
  3. Bandwidth aggregates across all bundled ports
  4. Optimized for 256-byte FLIT mode, eliminating legacy overhead

Bandwidth calculations:

| Configuration | Direction | Bandwidth |
|---|---|---|
| Single x16 port @ 128 GT/s | Unidirectional | 256 GB/s |
| Single x16 port @ 128 GT/s | Bidirectional | 512 GB/s |
| 3 bundled x16 ports @ 128 GT/s | Unidirectional | 768 GB/s |
| 3 bundled x16 ports @ 128 GT/s | Bidirectional | 1,536 GB/s |

For context, HBM3e memory on an H200 delivers 4.8 TB/s bandwidth.7 A bundled CXL 4.0 connection at 1.5 TB/s represents approximately 30% of that bandwidth—sufficient for many memory expansion use cases where capacity matters more than peak bandwidth.
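The table's entries follow from per-lane arithmetic. A minimal sketch (raw line rate only; FLIT framing and FEC overhead are ignored, so these are upper bounds rather than delivered throughput):

```python
def link_bandwidth_gbs(gt_per_s=128, lanes=16, ports=1, bidirectional=False):
    """Approximate CXL link bandwidth in GB/s from raw transfer rate.

    Treats each GT/s as one Gb/s per lane per direction (PAM4 is already
    reflected in the GT/s figure) and ignores FLIT/FEC framing overhead.
    """
    gbps = gt_per_s * lanes * ports  # gigabits per second, one direction
    if bidirectional:
        gbps *= 2
    return gbps / 8  # bits -> bytes

print(link_bandwidth_gbs())                             # single x16 port: 256.0 GB/s
print(link_bandwidth_gbs(ports=3, bidirectional=True))  # 3 bundled ports: 1536.0 GB/s
```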

PCIe 7.0 Foundation

CXL 4.0 builds on PCIe 7.0's physical layer improvements [8]:

  • 128 GT/s transfer rate: Double the 64 GT/s of PCIe 6.0
  • PAM4 signaling: Same encoding scheme as PCIe 6.0
  • Improved FEC: Forward error correction for signal integrity
  • Optical support: Enables longer reach connections

The specification retains the 256-byte FLIT format from CXL 3.x while adding a latency-optimized variant for time-sensitive operations.9

Multi-Rack Fabric Capabilities

CXL 4.0 extends reach through two mechanisms:

Four retimers supported: Previous generations allowed two retimers. Four retimers enable longer physical connections spanning multiple racks without signal degradation.10

Native x2 width: Previously a degraded fallback mode, x2 links now operate at full performance. This enables higher fan-out configurations where many lower-bandwidth connections serve more endpoints.11

These features combine to enable "multi-rack memory pooling"—a capability the CXL Consortium explicitly targets for late 2026-2027 production deployment.12


CXL Use Cases for AI Infrastructure

KV Cache Offloading for LLM Inference

The highest-impact near-term use case: offloading KV cache from GPU VRAM to CXL-attached memory.

The problem: LLM inference with long contexts generates massive KV caches. A 70B parameter model with 128K context and batch size 32 can require 150+ GB just for KV cache.13 This exceeds H100 VRAM, forcing expensive batch size reductions or multiple GPUs.

The CXL solution: Store KV cache in pooled CXL memory while keeping hot layers in GPU VRAM. XConn and MemVerge demonstrated this at SC25 and OCP 2025 [14]:

  • Two H100 GPUs (80GB each) running OPT-6.7B
  • KV cache offloaded to shared CXL memory pool
  • 3.8x speedup vs 200G RDMA
  • 6.5x speedup vs 100G RDMA
  • >5x improvement vs SSD-based KV cache

Research from academia confirms the opportunity. PNM-KV (Processing-Near-Memory for KV cache) achieves up to 21.9x throughput improvement by offloading token page selection to accelerators within CXL memory.15
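Mechanically, the hot/cold split resembles a capacity-bounded cache with spill. The toy policy below keeps recently used KV pages in a "VRAM" tier and demotes the coldest to a "CXL" tier, promoting on access. It is an illustrative sketch (the class and its structure are invented here), not XConn's or MemVerge's implementation:

```python
from collections import OrderedDict

class KVTierCache:
    """Toy two-tier KV page cache: hot pages in VRAM, cold pages spilled to CXL."""

    def __init__(self, vram_pages, cxl_pages):
        self.vram = OrderedDict()  # page_id -> payload, kept in LRU order
        self.cxl = OrderedDict()
        self.vram_cap, self.cxl_cap = vram_pages, cxl_pages

    def put(self, page_id, payload):
        self.vram[page_id] = payload
        self.vram.move_to_end(page_id)
        while len(self.vram) > self.vram_cap:    # demote coldest page to CXL
            cold_id, cold = self.vram.popitem(last=False)
            self.cxl[cold_id] = cold
            while len(self.cxl) > self.cxl_cap:  # overflow: drop (recompute later)
                self.cxl.popitem(last=False)

    def get(self, page_id):
        if page_id in self.vram:
            self.vram.move_to_end(page_id)
            return self.vram[page_id]
        payload = self.cxl.pop(page_id)          # CXL hit: promote back to VRAM
        self.put(page_id, payload)
        return payload
```

Real systems track bytes rather than page counts and overlap demotion with compute, but the promotion/demotion skeleton is the same.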

Memory Expansion for Training

Training workloads benefit from expanded memory capacity for:

  • Larger batch sizes: More samples per iteration without gradient accumulation
  • Activation checkpointing reduction: Store more activations in memory vs recomputation
  • Optimizer state: Adam optimizer requires 2x parameters for momentum/variance

CXL memory expansion enables training configurations previously requiring multi-node distribution to run on single nodes, reducing communication overhead.
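The optimizer-state bullet is worth quantifying. Under one common mixed-precision accounting (bf16 weights and gradients plus fp32 Adam moments and an fp32 master copy; conventions vary by framework, so treat the byte counts as assumptions), static training state costs about 16 bytes per parameter before activations:

```python
def train_state_bytes(n_params, weight_bytes=2, grad_bytes=2,
                      adam_bytes=8, master_bytes=4):
    """Static training memory (weights + grads + optimizer), excluding activations."""
    return n_params * (weight_bytes + grad_bytes + adam_bytes + master_bytes)

print(f"{train_state_bytes(70e9) / 1e12:.2f} TB for a 70B-parameter model")  # 1.12 TB
```

Numbers like these explain why single-node training of large models becomes memory-capacity-bound long before it is compute-bound.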

Scientific and HPC Workloads

PNNL's Crete project uses CXL pools for high-throughput memory sharing across compute nodes in scientific simulations.16 Use cases include:

  • Molecular dynamics with large neighbor lists
  • Graph analytics on trillion-edge datasets
  • In-memory databases exceeding single-server capacity

The Interconnect Landscape

Understanding where CXL fits requires recognizing that these technologies serve different purposes:

| Standard | Primary Purpose | Best For |
|---|---|---|
| CXL | Memory coherency + pooling | CPU-memory expansion, shared memory pools |
| NVLink | GPU-to-GPU scaling | Within-node GPU communication |
| UALink | Accelerator interconnect | Open standard alternative to NVLink |
| Ultra Ethernet | Scale-out networking | Multi-rack, 10,000+ endpoints |

CXL runs on PCIe SerDes: lower error rate, lower latency, but lower bandwidth than NVLink/UALink's Ethernet-style SerDes.17 NVLink 5 delivers 1.8 TB/s per GPU—far exceeding CXL 4.0's 512 GB/s per x16 port.18

The technologies complement rather than compete:

  • Within a GPU node: NVLink connects GPUs
  • Between nodes: UALink or InfiniBand/Ethernet
  • Memory expansion: CXL adds capacity to CPUs and accelerators
  • Fabric-wide memory pools: CXL switches enable sharing across hosts

Panmnesia proposes "CXL-over-XLink" architectures integrating all three, reporting 5.3x faster AI training and 6x inference latency reduction vs PCIe/RDMA baselines.19

Decision Framework: When to Use What

| Scenario | Recommended Interconnect | Rationale |
|---|---|---|
| Multi-GPU training within server | NVLink | Highest bandwidth, lowest latency |
| Multi-GPU inference pod (non-NVIDIA) | UALink | Open standard, high bandwidth |
| Expand memory beyond VRAM | CXL | Cache coherency, DRAM-like latency |
| Multi-rack GPU cluster | InfiniBand or Ultra Ethernet | Designed for scale-out |
| Shared memory pool across servers | CXL switches | Memory pooling with coherency |
| China/restricted markets | Consider UB-Mesh | Avoids Western IP dependencies |

CXL Ecosystem: Vendors and Products

Memory Expanders

The three major DRAM manufacturers all ship CXL memory expanders:

| Vendor | Product | Capacity | Interface | Status |
|---|---|---|---|---|
| Samsung | CMM-D | 256 GB | CXL 2.0 | Mass production 2025 [20] |
| SK Hynix | CMM-DDR5 | 128 GB | CXL 2.0 | Mass production late 2024 [21] |
| Micron | CZ120 | 256 GB | CXL 2.0 | Sampling [22] |
| SK Hynix | CMS | 512 GB | CXL (compute-enabled) | Announced [23] |

SK Hynix's CMS (Computational Memory Solution) adds compute capabilities directly in the memory module—an early implementation of processing-near-memory for CXL.

Switch Vendors

CXL switches enable memory pooling across multiple hosts:

| Vendor | Product | Generation | Status | Key Feature |
|---|---|---|---|---|
| XConn | XC50256 | CXL 2.0 | Shipping | 256-lane switch, first to market [24] |
| XConn | Apollo | CXL 2.0 | Shipping | Memory pooling demonstrations at SC25 [25] |
| Panmnesia | Fabric Switch | CXL 3.2 | Sampling Nov 2025 | First PBR implementation [26] |
| Astera Labs | Leo | CXL 2.0 | Shipping | Smart memory controller [27] |
| Microchip | SMC 2000 | CXL 2.0 | Shipping | Memory expansion controller [28] |

Panmnesia's CXL 3.2 Fabric Switch represents a generation leap: first silicon implementing port-based routing for true fabric architectures with up to 4,096 nodes.29

Controller Vendors

CXL memory controllers translate between CXL protocol and DRAM:

| Vendor | Role | Key Products |
|---|---|---|
| Marvell | Controller | Structera CXL controllers [30] |
| Montage | Controller | CXL memory buffer chips |
| Astera Labs | Controller | Leo smart memory controller |
| Microchip | Controller | SMC 2000 series |

Marvell's Structera completed interoperability testing with all three major memory suppliers (Samsung, Micron, SK Hynix) on both Intel and AMD platforms.31


Deployment Planning Guide

Timeline

| Period | CXL Generation | Expected Capability | Recommendation |
|---|---|---|---|
| Now-Q2 2026 | CXL 2.0 | Memory expansion, basic pooling | Production evaluation |
| Q3-Q4 2026 | CXL 3.0/3.1 | Fabric, peer-to-peer, 4K nodes | Early adoption for AI |
| 2027+ | CXL 4.0 | Multi-rack pooling, 1.5 TB/s | Planning begins now |

ABI Research expects CXL 3.0/3.1 solutions with sufficient software support for commercial adoption by 2027.32

What to Evaluate Now

Immediate (2025):

  1. Test CXL 2.0 memory expanders on existing Intel Sapphire Rapids or AMD EPYC Genoa servers
  2. Evaluate XConn or Astera Labs switches for memory pooling proofs-of-concept
  3. Benchmark KV cache offloading with MemVerge GISMO technology [33]

2026 planning:

  1. Assess Panmnesia CXL 3.2 switch samples when available
  2. Plan for CXL-enabled GPU servers (NVIDIA Blackwell supports CXL) [34]
  3. Develop a software stack for memory tiering between GPU VRAM and CXL memory

Infrastructure Readiness Checklist

Before deploying CXL:

  • [ ] CPU platform supports CXL (Intel 4th Gen+ or AMD EPYC 4th Gen+)
  • [ ] PCIe slots available for CXL devices
  • [ ] Operating system with CXL support (Linux 6.0+ for basic, 6.8+ for hotplug) [35]
  • [ ] Application software can tier memory (NUMA-aware or explicit APIs)
  • [ ] Monitoring for CXL RAS events and memory errors

Software Stack Considerations

CXL memory appears as additional NUMA nodes to the operating system. Applications must be CXL-aware or NUMA-aware to benefit:

| Software Layer | Requirement | Solutions |
|---|---|---|
| OS kernel | CXL driver support | Linux 6.0+, Windows Server 2025 |
| Memory allocator | NUMA-aware allocation | numactl, hwloc, libmemkind |
| Application | Memory tiering | Explicit placement or transparent tiering |
| AI framework | KV cache offloading | vLLM CXL support (in development) [36] |

MemVerge's Memory Machine provides transparent memory tiering for applications without CXL-specific code paths.37
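On Linux, a CXL expander typically surfaces as a memory-only NUMA node: it exposes memory but has an empty `cpulist` in sysfs. A small heuristic sketch (the sysfs path is parameterized; exact presentation varies by platform and kernel version):

```python
from pathlib import Path

def memory_only_nodes(node_dir="/sys/devices/system/node"):
    """Return NUMA node ids that expose memory but no CPUs.

    CXL Type 3 memory usually appears this way, but so can other
    far-memory devices, so treat the result as a heuristic.
    """
    found = []
    for entry in sorted(Path(node_dir).glob("node[0-9]*")):
        cpulist = (entry / "cpulist").read_text().strip()
        if not cpulist:  # empty cpulist -> no CPUs on this node
            found.append(int(entry.name[len("node"):]))
    return found
```

Once identified, a node can be targeted explicitly, e.g. `numactl --membind=<node> ./app` for cold-tier placement.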

Cost Considerations

CXL memory costs less per GB than GPU VRAM:

| Memory Type | Approximate Cost/GB | Latency |
|---|---|---|
| HBM3e (in GPU) | $15-25 | ~10 ns |
| DDR5 DIMM | $3-5 | ~80 ns |
| CXL-attached DDR5 | $4-7 | 200-500 ns |
| NVMe SSD | $0.10-0.20 | ~100 μs |

For KV cache that tolerates 200-500 ns latency, CXL offers 4-5x cost reduction vs keeping data in GPU VRAM while delivering 200-500x lower latency than SSD offloading.38

Migration Strategy

Phase 1: Memory expansion (Now)

  • Deploy CXL memory expanders for capacity
  • No application changes required
  • OS treats CXL memory as a slow NUMA node

Phase 2: Memory tiering (2025-2026)

  • Implement tiering between fast local memory and CXL
  • Hot data in local DRAM, cold data in CXL memory
  • Requires memory management software

Phase 3: Memory pooling (2026-2027)

  • Deploy CXL switches for shared memory pools
  • Multiple hosts access common memory resources
  • Enables disaggregated memory architecture


Integration with Existing Infrastructure

NVIDIA GPU Systems

NVIDIA supports CXL on Blackwell architecture. The Grace CPU in Grace Hopper systems includes CXL support, enabling memory expansion beyond the 480GB unified memory.39

For H100 systems, CXL memory expanders attach to the host CPU, not directly to GPUs. KV cache offloading requires CPU involvement to copy data between GPU VRAM and CXL memory.

AMD GPU Systems

AMD MI300X includes CXL support through its CPU chiplet. The 192GB unified memory can be supplemented with CXL-attached capacity.40

Intel GPU Systems

Intel Data Center GPU Max (Ponte Vecchio) supports CXL for memory expansion on systems with Sapphire Rapids or later CPUs.41


Risks and Considerations

Standards Fragmentation

Four competing interconnect ecosystems (CXL/PCIe, UALink, Ultra Ethernet, NVLink) force infrastructure planners to make bets. Equipment purchased today may face interoperability challenges in 2027.

Mitigation: CXL's PCIe foundation ensures backward compatibility. CXL 4.0 devices will work with CXL 3.x, 2.0, 1.1, and 1.0 systems at reduced capability.42

Software Maturity

CXL software support remains early-stage. Linux kernel support exists but many applications lack CXL-specific optimizations.

Mitigation: Use memory tiering software like MemVerge Memory Machine that provides transparent CXL integration.

Supply Chain

CXL 4.0 products won't reach volume production until 2027. CXL 3.0/3.1 availability depends on PCIe 6.0 ecosystem maturity.

Mitigation: Begin with CXL 2.0 products available today. Build operational experience before next-generation availability.


Key Takeaways

For infrastructure planners:

  • CXL 4.0 enables 100+ TB memory pools with cache coherency across racks
  • Bundled ports deliver 1.5 TB/s bandwidth per logical connection
  • Production deployment timeline: CXL 2.0 now, CXL 3.x late 2026, CXL 4.0 2027+
  • Start evaluating CXL 2.0 memory expanders and switches immediately

For AI platform teams:

  • KV cache offloading to CXL memory delivers 3.8-6.5x speedup vs RDMA
  • CXL memory costs 4-5x less than GPU VRAM while delivering 200-500x lower latency than SSD
  • vLLM and other frameworks are developing CXL-aware KV cache management
  • Test with XConn/MemVerge demonstrations available now

For strategic planning:

  • CXL complements NVLink/UALink; they serve different purposes
  • No single interconnect standard will "win"; plan for coexistence
  • Equipment decisions in 2025-2026 affect interoperability through 2030
  • The Chinese alternative (Huawei UB-Mesh) may create a parallel ecosystem

For procurement:

  • CXL 2.0 products shipping from Samsung, SK Hynix, Micron
  • XConn and Astera Labs switches available for evaluation
  • Panmnesia CXL 3.2 fabric switch sampling since November 2025
  • Budget for CXL-enabled servers in 2026-2027 refresh cycles


For AI infrastructure deployment with CXL-enabled memory architecture, contact Introl.


References


  1. Compute Express Link. "Overcoming the AI Memory Wall: How CXL Memory Pooling Powers the Next Leap in Scalable AI Computing." 2025. https://computeexpresslink.org/blog/overcoming-the-ai-memory-wall-how-cxl-memory-pooling-powers-the-next-leap-in-scalable-ai-computing-4267/ 

  2. NVIDIA. "H200 and B200 Specifications." 2025. 

  3. Synopsys. "How CXL and Memory Pooling Reduce HPC Latency." 2025. https://www.synopsys.com/blogs/chip-design/cxl-protocol-memory-pooling.html 

  4. AMI Next Blog. "CXL Deep Dive: From Principles to Memory Pooling." 2025. https://www.aminext.blog/en/post/cxl-compute-express-link-deep-dive-1 

  5. CXL Consortium. "CXL 3.0 Specification." 2023. 

  6. Blocks and Files. "CXL 4.0 doubles bandwidth and stretches memory pooling to multi-rack setups." November 24, 2025. https://blocksandfiles.com/2025/11/24/cxl-4/ 

  7. NVIDIA. "H200 Tensor Core GPU Specifications." 2024. 

  8. VideoCardz. "CXL 4.0 spec moves to PCIe 7.0, doubles bandwidth over CXL 3.0." November 2025. https://videocardz.com/newz/cxl-4-0-spec-moves-to-pcie-7-0-doubles-bandwidth-over-cxl-3-0 

  9. Synopsys. "CXL 4.0, Bandwidth First: What Designers Are Solving for Next." December 2025. https://www.synopsys.com/blogs/chip-design/cxl-4-bandwidth-first-what-designers-are-solving-next.html 

  10. Datacenter News. "CXL 4.0 doubles bandwidth, introduces bundled ports for data centres." November 2025. https://datacenter.news/story/cxl-4-0-doubles-bandwidth-introduces-bundled-ports-for-data-centres 

  11. CXL Consortium. "CXL 4.0 Webinar." December 4, 2025. https://computeexpresslink.org/wp-content/uploads/2025/12/CXL_4.0-Webinar_December-2025_FINAL.pdf 

  12. Business Wire. "CXL Consortium Releases the Compute Express Link 4.0 Specification." November 18, 2025. https://www.businesswire.com/news/home/20251118275848/en/CXL-Consortium-Releases-the-Compute-Express-Link-4.0-Specification-Increasing-Speed-and-Bandwidth 

  13. arXiv. "Scalable Processing-Near-Memory for 1M-Token LLM Inference." November 2025. https://arxiv.org/abs/2511.00321 

  14. PRWeb. "XConn Technologies and MemVerge Demonstrate CXL Memory Pool for KV Cache." October 2025. https://www.prweb.com/releases/xconn-technologies-and-memverge-demonstrate-cxl-memory-pool-for-kv-cache-using-nvidia-dynamo-for-breakthrough-ai-workload-performance-at-2025-ocp-global-summit-302581860.html 

  15. arXiv. "PNM-KV: Scalable Processing-Near-Memory for 1M-Token LLM Inference." 2025. https://arxiv.org/html/2501.09020v1 

  16. CXL Consortium. "How CXL Transforms Server Memory Infrastructure." October 2025. https://computeexpresslink.org/wp-content/uploads/2025/10/CXL_Q3-2025-Webinar_FINAL.pdf 

  17. Blocks and Files. "Panmnesia pushes unified memory and interconnect design for AI superclusters." July 2025. https://blocksandfiles.com/2025/07/18/panmnesia-cxl-over-xlink-ai-supercluster-architecture/ 

  18. Next Platform. "UALink Fires First GPU Interconnect Salvo At Nvidia NVSwitch." April 2025. https://www.nextplatform.com/2025/04/08/ualink-fires-first-gpu-interconnect-salvo-at-nvidia-nvswitch/ 

  19. Business News This Week. "Panmnesia Introduces AI Infrastructure with CXL over NVLink and UALink." July 2025. https://businessnewsthisweek.com/technology/panmnesia-introduces-todays-and-tomorrows-ai-infrastructure-including-a-supercluster-architecture-that-integrates-nvlink-ualink-and-hbm-via-cxl/ 

  20. TrendForce. "SK hynix and Samsung Step up Focus on HBM4 and CXL." October 2024. https://www.trendforce.com/news/2024/10/24/news-sk-hynix-and-samsung-reportedly-step-up-focus-on-hbm4-and-cxl-amid-rising-chinese-competition/ 

  21. ServeTheHome. "SK hynix CXL 2.0 Memory Expansion Modules Launched." 2024. https://www.servethehome.com/sk-hynix-cxl-2-0-memory-expansion-modules-launched-with-96gb-of-ddr5/ 

  22. AnandTech. "CXL Gathers Momentum at FMS 2024." 2024. https://www.anandtech.com/show/21533/cxl-gathers-momentum-at-fms-2024 

  23. Tom's Hardware. "SK Hynix Unveils CXL Memory Module with Compute Capabilities." 2025. https://www.tomshardware.com/news/sk-hynix-unveils-cxl-computational-memory-solution 

  24. EE Times. "XConn Shows Off First CXL Switch." 2024. https://www.eetimes.com/xconn-shows-off-first-cxl-switch/ 

  25. CXL Consortium. "Supercomputing 2025 Demonstrations." November 2025. https://computeexpresslink.org/event/supercomputing-2025/ 

  26. Business Wire. "Panmnesia Announces Sample Availability of PCIe 6.0/CXL 3.2 Fabric Switch." November 12, 2025. https://www.businesswire.com/news/home/20251112667725/en/Panmnesia-Announces-Sample-Availability-of-PCIe-6.0CXL-3.2-Fabric-Switch 

  27. Storage Newsletter. "FMS 2025: Astera Labs and SMART Modular CXL Demo." August 2025. https://www.storagenewsletter.com/2025/08/06/fms-2025-h3-platform-debuts-cxl-memory-sharing-and-pooling-solution/ 

  28. AnandTech. "Micron and Microchip CXL 2.0 Memory Module." 2024. 

  29. TechPowerUp. "Panmnesia Samples Industry's First PCIe 6.0/CXL 3.2 Fabric Switch." November 2025. 

  30. Yahoo Finance. "Marvell Extends CXL Ecosystem Leadership with Structera." September 2025. https://finance.yahoo.com/news/marvell-extends-cxl-ecosystem-leadership-130000255.html 

  31. Marvell. "Structera CXL Memory-Expansion Controllers Interoperability Announcement." September 2025. 

  32. GIGABYTE. "Revolutionizing the AI Factory: The Rise of CXL Memory Pooling." 2025. https://www.gigabyte.com/Article/revolutionizing-the-ai-factory-the-rise-of-cxl-memory-pooling 

  33. Storage Newsletter. "XConn Technologies and MemVerge CXL Memory Solution for KV Cache." November 2025. https://www.storagenewsletter.com/2025/11/21/sc25-xconn-technologies-and-memverge-to-deliver-breakthrough-scalable-cxl-memory-solution-to-offload-kv-cache-and-prefill-decode-disaggregation-in-ai-inference-workloads/ 

  34. NVIDIA. "Blackwell Architecture Technical Brief." 2024. 

  35. Wikipedia. "Compute Express Link - Linux Support." https://en.wikipedia.org/wiki/Compute_Express_Link 

  36. arXiv. "Exploring CXL-based KV Cache Storage for LLM Serving." 2024. https://mlforsystems.org/assets/papers/neurips2024/paper17.pdf 

  37. MemVerge. "Memory Machine Software for CXL." 2025. 

  38. Penguin Solutions. "Explaining CXL Memory 101: Expansion, Pooling, & Sharing." 2025. https://www.penguinsolutions.com/en-us/resources/blog/what-is-cxl-memory-expansion 

  39. NVIDIA. "Grace Hopper Superchip Architecture." 2024. 

  40. AMD. "Instinct MI300X Specifications." 2024. 

  41. Intel. "Data Center GPU Max (Ponte Vecchio) Documentation." 2024. 

  42. CXL Consortium. "CXL 4.0 Backward Compatibility Statement." November 2025. 

  43. Servers Simply. "Next-Gen HPC & AI Infrastructure 2025: GPUs, CXL, Gen5 NVMe." 2025. https://www.serversimply.com/blog/next-gen-hpc-and-ai-infrastructure-in-2025 

  44. Semi Engineering. "CXL Thriving As Memory Link." 2025. https://semiengineering.com/cxl-thriving-as-memory-link/ 

  45. HPCwire. "Everyone Except Nvidia Forms Ultra Accelerator Link Consortium." May 2024. https://www.hpcwire.com/2024/05/30/everyone-except-nvidia-forms-ultra-accelerator-link-ualink-consortium/ 

  46. ServeTheHome. "Huawei Presents UB-Mesh Interconnect for Large AI SuperNodes at Hot Chips 2025." August 2025. 

  47. arXiv. "Amplifying Effective CXL Memory Bandwidth for LLM Inference via Transparent Near-Data Processing." September 2025. https://arxiv.org/abs/2509.03377 

  48. CXL Consortium. "Advantages of CXL Memory Sharing for Emerging Applications." June 2025. https://computeexpresslink.org/wp-content/uploads/2025/06/CXL_Q2-2025-Webinar_FINAL.pdf 
