AI, complex simulations, and massive datasets require multiple GPUs with extremely fast interconnections and a fully accelerated software stack. The NVIDIA HGX platform brings together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA networking, and fully optimized AI and high-performance computing (HPC) software stacks to provide the highest application performance and drive the fastest time to insights for every data center.
The NVIDIA HGX B300 integrates NVIDIA Blackwell Ultra GPUs with high-speed interconnects to propel the data center into a new era of accelerated computing and generative AI. As a premier accelerated scale-up platform with up to 11x more inference performance than the previous generation, NVIDIA Blackwell-based HGX systems are designed for the most demanding generative AI, data analytics, and HPC workloads.
NVIDIA HGX includes advanced networking options, at speeds up to 800 gigabits per second (Gb/s), using NVIDIA Quantum-X800 InfiniBand and Spectrum-X Ethernet for the highest AI performance. HGX also includes NVIDIA BlueField-3 data processing units (DPUs) to enable cloud networking, composable storage, zero-trust security, and GPU compute elasticity in hyperscale AI clouds.
Projected performance subject to change. Token-to-token latency (TTL) = 20 ms real time, first token latency (FTL) = 5 s, input sequence length = 32,768, output sequence length = 1,028. 1x eight-way HGX H100 air-cooled vs. 1x eight-way HGX B300 air-cooled, per-GPU performance comparison; served using disaggregated inference.
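As a rough illustration of what these latency settings imply for a single request, the sketch below converts them into a per-user token rate and an end-to-end response time. It assumes the first token arrives exactly at the FTL limit and each later token arrives one TTL after the previous one; the variable names are illustrative and not part of any NVIDIA tool.

```python
# Latency budget implied by the benchmark settings quoted above.
# Assumption: token 1 arrives at FTL, each later token arrives one TTL apart.
TTL_S = 0.020          # token-to-token latency: 20 ms
FTL_S = 5.0            # first token latency: 5 s
OUTPUT_TOKENS = 1028   # output sequence length as quoted

# Rate at which a single user sees tokens once generation has started.
tokens_per_second_per_user = 1.0 / TTL_S

# Time until the last output token arrives under the assumption above.
response_time_s = FTL_S + (OUTPUT_TOKENS - 1) * TTL_S

print(f"Per-user decode rate: {tokens_per_second_per_user:.0f} tokens/s")
print(f"Time to full response at the latency limits: {response_time_s:.2f} s")
```

These are upper bounds set by the benchmark constraints, not measured results for either platform.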
HGX B300 achieves up to 11x higher inference performance over the previous NVIDIA Hopper generation for models such as Llama 3.1 405B. The second-generation Transformer Engine uses custom Blackwell Tensor Core technology combined with TensorRT-LLM innovations to accelerate inference for large language models (LLMs).
Projected performance subject to change. 1x eight-way HGX H100 vs. 1x eight-way HGX B300, per-GPU performance comparison.
The second-generation Transformer Engine, featuring 8-bit floating point (FP8) and new precisions, enables up to 4x faster training for large language models like Llama 3.1 405B. This is complemented by fifth-generation NVLink with 1.8 TB/s of GPU-to-GPU interconnect, InfiniBand networking, and NVIDIA Magnum IO software. Together, these ensure efficient scalability for enterprises and extensive GPU computing clusters.
The data center is the new unit of computing, and networking plays an integral role in scaling application performance across it. Paired with NVIDIA Quantum InfiniBand, HGX delivers world-class performance and efficiency, which ensures the full utilization of computing resources.
For AI cloud data centers that deploy Ethernet, HGX is best used with the NVIDIA Spectrum-X networking platform, which powers the highest AI performance over Ethernet. It features Spectrum-X switches and NVIDIA SuperNIC for optimal resource utilization and performance isolation, delivering consistent, predictable outcomes for thousands of simultaneous AI jobs at every scale. Spectrum-X enables advanced cloud multi-tenancy and zero-trust security. As a reference design, NVIDIA has designed Israel-1, a hyperscale generative AI supercomputer built with Dell PowerEdge XE9680 servers based on the NVIDIA HGX 8-GPU platform, BlueField-3 SuperNICs, and Spectrum-4 switches.
NVIDIA HGX is available in single baseboards with four or eight Hopper SXMs or eight NVIDIA Blackwell or NVIDIA Blackwell Ultra SXMs. These powerful combinations of hardware and software lay the foundation for unprecedented AI supercomputing performance.
Specification | HGX B300 | HGX B200 |
---|---|---|
Form Factor | 8x NVIDIA Blackwell Ultra SXM | 8x NVIDIA Blackwell SXM |
FP4 Tensor Core** | 144 PFLOPS / 105 PFLOPS | 144 PFLOPS / 72 PFLOPS |
FP8/FP6 Tensor Core* | 72 PFLOPS | 72 PFLOPS |
INT8 Tensor Core* | 2 POPS | 72 POPS |
FP16/BF16 Tensor Core* | 36 PFLOPS | 36 PFLOPS |
TF32 Tensor Core* | 18 PFLOPS | 18 PFLOPS |
FP32 | 600 TFLOPS | 600 TFLOPS |
FP64/FP64 Tensor Core | 10 TFLOPS | 296 TFLOPS |
Total Memory | Up to 2.3 TB | 1.4 TB |
NVLink | Fifth generation | Fifth generation |
NVIDIA NVSwitch | NVLink 5 Switch | NVLink 5 Switch |
NVSwitch GPU-to-GPU Bandwidth | 1.8 TB/s | 1.8 TB/s |
Total NVLink Bandwidth | 14.4 TB/s | 14.4 TB/s |
Networking Bandwidth | 1.6 TB/s | 0.8 TB/s |
Attention Performance | 2X | 1X |
* With sparsity
** With sparsity / without sparsity
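The total NVLink bandwidth listed for each platform is consistent with eight GPUs at 1.8 TB/s each, and the two FP4 values per platform give the ratio between sparse and dense throughput. A minimal sketch of that arithmetic, using only values from the table above (the dictionary layout and key names are illustrative):

```python
import math

GPUS_PER_BASEBOARD = 8

# Values copied from the table above; the structure itself is illustrative.
specs = {
    "HGX B300": {"nvlink_per_gpu_tbs": 1.8, "fp4_sparse_pflops": 144, "fp4_dense_pflops": 105},
    "HGX B200": {"nvlink_per_gpu_tbs": 1.8, "fp4_sparse_pflops": 144, "fp4_dense_pflops": 72},
}

for name, s in specs.items():
    # Aggregate NVLink bandwidth across all GPUs on the baseboard.
    s["total_nvlink_tbs"] = GPUS_PER_BASEBOARD * s["nvlink_per_gpu_tbs"]
    # How much of the FP4 headline figure comes from sparsity.
    s["sparsity_gain"] = s["fp4_sparse_pflops"] / s["fp4_dense_pflops"]
    print(f"{name}: total NVLink {s['total_nvlink_tbs']:.1f} TB/s, "
          f"FP4 sparse/dense ratio {s['sparsity_gain']:.2f}x")

# Sanity check against the "Total NVLink Bandwidth" row.
assert math.isclose(specs["HGX B300"]["total_nvlink_tbs"], 14.4)
```

The ratio shows that the two platforms share the same sparse FP4 figure but differ in dense FP4 throughput.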
Specification | HGX H200 4-GPU | HGX H200 8-GPU |
---|---|---|
Form Factor | 4x NVIDIA H200 SXM | 8x NVIDIA H200 SXM |
FP8 Tensor Core* | 16 PFLOPS | 32 PFLOPS |
INT8 Tensor Core* | 16 POPS | 32 POPS |
FP16/BF16 Tensor Core* | 8 PFLOPS | 16 PFLOPS |
TF32 Tensor Core* | 4 PFLOPS | 8 PFLOPS |
FP32 | 270 TFLOPS | 540 TFLOPS |
FP64 | 140 TFLOPS | 270 TFLOPS |
FP64 Tensor Core | 270 TFLOPS | 540 TFLOPS |
Total Memory | 564 GB HBM3e | 1.1 TB HBM3e |
GPU Aggregate Bandwidth | 19 TB/s | 38 TB/s |
NVLink | Fourth generation | Fourth generation |
NVSwitch | N/A | NVLink 4 Switch |
NVSwitch GPU-to-GPU Bandwidth | N/A | 900 GB/s |
Total Aggregate Bandwidth | 3.6 TB/s | 7.2 TB/s |
Networking Bandwidth | 0.4 TB/s | 0.8 TB/s |
Specification | HGX H100 4-GPU | HGX H100 8-GPU |
---|---|---|
Form Factor | 4x NVIDIA H100 SXM | 8x NVIDIA H100 SXM |
FP8 Tensor Core* | 16 PFLOPS | 32 PFLOPS |
INT8 Tensor Core* | 16 POPS | 32 POPS |
FP16/BF16 Tensor Core* | 8 PFLOPS | 16 PFLOPS |
TF32 Tensor Core* | 4 PFLOPS | 8 PFLOPS |
FP32 | 270 TFLOPS | 540 TFLOPS |
FP64 | 140 TFLOPS | 270 TFLOPS |
FP64 Tensor Core | 270 TFLOPS | 540 TFLOPS |
Total Memory | 320 GB HBM3 | 640 GB HBM3 |
GPU Aggregate Bandwidth | 13 TB/s | 27 TB/s |
NVLink | Fourth generation | Fourth generation |
NVSwitch | N/A | NVLink 4 Switch |
NVSwitch GPU-to-GPU Bandwidth | N/A | 900 GB/s |
Total Aggregate Bandwidth | 3.6 TB/s | 7.2 TB/s |
Networking Bandwidth | 0.4 TB/s | 0.8 TB/s |
* With sparsity
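Dividing a platform's total memory by its GPU count recovers the per-GPU capacity, which makes the difference between H100 and H200 easier to see. The sketch below uses only the totals that the Hopper tables list in GB (the 1.1 TB figure for the 8-GPU H200 is rounded, so it is left out); the names are illustrative.

```python
# (total memory in GB, GPU count), as listed in the HGX H200 and H100 tables.
configs = {
    "HGX H200 4-GPU": (564, 4),
    "HGX H100 4-GPU": (320, 4),
    "HGX H100 8-GPU": (640, 8),
}

# Per-GPU memory capacity for each configuration.
per_gpu_gb = {name: total / gpus for name, (total, gpus) in configs.items()}

for name, gb in per_gpu_gb.items():
    print(f"{name}: {gb:.0f} GB per GPU")

# Relative per-GPU capacity of H200 compared with H100.
h200_vs_h100 = per_gpu_gb["HGX H200 4-GPU"] / per_gpu_gb["HGX H100 4-GPU"]
print(f"H200 offers {h200_vs_h100:.2f}x the per-GPU memory of H100")
```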
Learn more about the NVIDIA Blackwell architecture.