Real-time NVIDIA B200 cloud pricing from 6 providers. Cheapest on-demand: $5.29/hr (Lambda). Updated daily by GridStackHub.
Sorted by cheapest per-GPU hourly rate. Includes on-demand, spot, and reserved pricing where available.
| Provider | Instance | Price/hr (per GPU) | Type | Region | VRAM | Updated |
|---|---|---|---|---|---|---|
| Lambda (lowest) | 1x B200 SXM | $5.29 | On-Demand | US | 192 GB | Apr 12, 2026 |
| CoreWeave | B200 SXM (Early Access) | $5.49 | On-Demand | US | 192 GB | Apr 12, 2026 |
| RunPod | NVIDIA B200 | $5.98 | On-Demand | us-east-1 | 180 GB | Apr 22, 2026 |
| Google Cloud | a4-highgpu-8g (8x B200) | $6.60 ($52.80 per 8× node) | On-Demand | us-central1 | 192 GB | Apr 12, 2026 |
| AWS | p6.48xlarge (8x B200) | $6.90 ($55.20 per 8× node) | On-Demand | us-east-1 | 192 GB | Apr 12, 2026 |
| Azure | ND B200 v6 (8x B200) | $7.05 ($56.40 per 8× node) | On-Demand | East US | 192 GB | Apr 12, 2026 |
Key hardware specifications for the NVIDIA B200.

| Specification | Value |
|---|---|
| Architecture | Blackwell (SXM) |
| VRAM | 192 GB HBM3e |
| Memory Bandwidth | 8.0 TB/s |
| FP8 Throughput | 9,000 TFLOPS |
| FP16 Throughput | 4,500 TFLOPS |
| GPU-to-GPU Bandwidth | 1.8 TB/s (5th Gen NVLink) |
| TDP | 1,000 W |
The NVIDIA B200 is NVIDIA's current flagship data-center GPU, built on the Blackwell architecture. With 192 GB of HBM3e at 8.0 TB/s and 9,000 TFLOPS of FP8 compute, it offers roughly 1.7× the memory bandwidth of an H200 (4.8 TB/s) and about 2.4× that of an H100 SXM5 (3.35 TB/s), a decisive advantage in memory-bandwidth-bound workloads.
B200 supply remains constrained in April 2026. Lambda ($5.29/hr) and CoreWeave ($5.49/hr) are the most accessible on-demand providers. Hyperscaler pricing (AWS at $6.90/GPU, Google Cloud at $6.60/GPU, Azure at $7.05/GPU) reflects per-GPU rates normalized from 8-GPU node pricing.
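The per-GPU normalization above is plain division of the listed node rate by the GPU count; a minimal sketch using the on-demand node prices from the table:

```python
# Normalize 8-GPU node pricing to per-GPU hourly rates.
# Node prices are the on-demand rates listed in the table above.
NODE_PRICES = {
    "Google Cloud a4-highgpu-8g": 52.80,  # $/hr for an 8x B200 node
    "AWS p6.48xlarge": 55.20,
    "Azure ND B200 v6": 56.40,
}
GPUS_PER_NODE = 8

# Per-GPU rate = node rate / GPUs per node, rounded to cents.
per_gpu = {name: round(price / GPUS_PER_NODE, 2)
           for name, price in NODE_PRICES.items()}

for name, rate in sorted(per_gpu.items(), key=lambda kv: kv[1]):
    print(f"{name}: ${rate:.2f}/GPU-hr")
```

This reproduces the $6.60, $6.90, and $7.05 per-GPU figures quoted above.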
The 192GB HBM3e enables running frontier-size models on a single GPU. For inference on 70B–405B parameter models, B200 on-demand eliminates multi-GPU tensor parallelism in many configurations, simplifying deployment and reducing interconnect overhead.
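A back-of-envelope capacity check makes the single-GPU claim concrete. This is a rough sketch: the 1.2× overhead factor for KV cache, activations, and runtime buffers is an assumption, not a measured figure.

```python
# Rough check: which model sizes fit in a single 192 GB B200 at a given precision?
VRAM_GB = 192
OVERHEAD = 1.2  # assumed headroom for KV cache, activations, runtime buffers

def fits(params_b: float, bytes_per_param: float) -> bool:
    """True if weights (plus assumed overhead) fit in one B200's 192 GB."""
    needed_gb = params_b * bytes_per_param * OVERHEAD
    return needed_gb <= VRAM_GB

for params in (70, 180, 405):
    for prec, nbytes in (("FP16", 2), ("FP8", 1), ("INT4", 0.5)):
        verdict = "fits" if fits(params, nbytes) else "needs multi-GPU"
        print(f"{params}B @ {prec}: {verdict}")
```

Under these assumptions a 70B model fits even at FP16, while 405B-class models still require multiple GPUs at the listed precisions, consistent with "many configurations" rather than all.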
NVLink 5 interconnect on B200 nodes provides 1.8 TB/s of GPU-to-GPU bandwidth, double the 900 GB/s of the H100 SXM5's NVLink 4. For dense training on 1T+ parameter models, the B200's bandwidth advantage compounds across nodes. CoreWeave and Lambda have the deepest current B200 inventory.
Use GridStackHub's GPU cost calculator to get a ranked comparison with hidden-cost breakdown (egress + storage) across all providers.
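The kind of all-in estimate such a calculator produces can be approximated as GPU hours plus storage plus egress. The storage and egress rates below are hypothetical placeholders, not any provider's actual pricing:

```python
# All-in monthly cost sketch: GPU time + storage + egress.
# Storage/egress rates here are illustrative placeholders, not real prices.
def monthly_cost(gpu_rate_hr: float, gpu_hours: float,
                 storage_gb: float, storage_rate_gb_mo: float,
                 egress_gb: float, egress_rate_gb: float) -> float:
    """Sum compute, storage, and egress line items for one month."""
    return (gpu_rate_hr * gpu_hours
            + storage_gb * storage_rate_gb_mo
            + egress_gb * egress_rate_gb)

# Example: one B200 at $5.29/hr (Lambda's listed rate) for 720 hrs/month,
# with 2 TB of storage and 500 GB of egress at assumed rates.
total = monthly_cost(5.29, 720, 2000, 0.10, 500, 0.09)
print(f"Estimated monthly cost: ${total:,.2f}")
```

Even at these placeholder rates, storage and egress add several hundred dollars a month on top of the headline GPU price, which is why a per-provider breakdown matters.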