
# Vera Rubin NVL72 Detailed Summary

- CPU: 36 Vera CPUs (88-core Arm-based Olympus custom cores, supports up to 1.5TB LPDDR5X memory)
- GPU: 72 Rubin GPUs (Equipped with HBM4, 288GB per GPU, Transformer Engine support)
- Superchip unit: 1 Vera CPU + 2 Rubin GPUs combined
- Other chips: Extreme co-design of 6 chips including NVLink 6 switches, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet switches

## Performance (vs. Blackwell)

- Inference: 5x improvement (3.6 EFLOPS based on NVFP4, 50 PFLOPS per GPU)
- Training: 3.5x improvement (2.5 EFLOPS based on NVFP4)
- Cost per token: 1/10th for MoE models (Massive reduction in inference costs)
- MoE model training: Required GPUs reduced to 1/4
- Memory: 20.7TB HBM4 + 54TB LPDDR5X
- Bandwidth: 3.6 TB/s per GPU via NVLink 6, 260 TB/s for the entire rack (Exceeds total internet bandwidth)

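
The rack-level totals in the summary follow from the per-chip figures, and a quick sanity check makes that visible. This is a minimal sketch, not an official calculation: the constants are copied from the list above, the function and variable names are made up for illustration, and it assumes decimal units (1 TB = 1000 GB, 1 EFLOPS = 1000 PFLOPS).

```python
# Per-unit figures as quoted in the summary (not independently verified).
NUM_GPUS = 72
NUM_CPUS = 36
HBM4_PER_GPU_GB = 288
LPDDR5X_PER_CPU_TB = 1.5
NVFP4_INFERENCE_PFLOPS_PER_GPU = 50
NVLINK6_TBPS_PER_GPU = 3.6


def rack_totals():
    """Derive rack-level totals from per-chip figures (decimal units assumed)."""
    return {
        # 72 GPUs x 288 GB each, converted to TB
        "hbm4_tb": NUM_GPUS * HBM4_PER_GPU_GB / 1000,
        # 36 CPUs x 1.5 TB each
        "lpddr5x_tb": NUM_CPUS * LPDDR5X_PER_CPU_TB,
        # 72 GPUs x 50 PFLOPS each, converted to EFLOPS
        "inference_eflops": NUM_GPUS * NVFP4_INFERENCE_PFLOPS_PER_GPU / 1000,
        # 72 GPUs x 3.6 TB/s each
        "nvlink_tbps": NUM_GPUS * NVLINK6_TBPS_PER_GPU,
        # Matches the "1 CPU + 2 GPUs" superchip unit
        "gpus_per_cpu": NUM_GPUS / NUM_CPUS,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the HBM4 total comes to 20.736 TB (quoted as 20.7TB), and the NVLink total comes to about 259 TB/s, which the summary rounds up to 260 TB/s.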
They're using HBM4, which is multiple layers of DRAM stacked together, and they're using it by the terabyte. With 288GB per GPU across 72 GPUs in a single rack, it'd be more shocking if we *didn't* have a RAM shortage.
"Netizens are in awe (and fear) of NVIDIA's new memory-hungry monster that's basically vacuuming up the world's RAM supply, leaving nothing for the rest of us."