Price Comparison: Providers With Both H100 and B200

The table below shows per-GPU on-demand pricing for providers that currently offer both H100 SXM5 and B200 capacity. Hyperscaler prices are normalised to per-GPU rates (e.g. AWS's 8-GPU node rate ÷ 8). Data is pulled live from the GridStackHub.ai database.

| Provider | H100 SXM5 ($/GPU/hr) | B200 ($/GPU/hr) | B200/H100 Ratio |
|---|---|---|---|
| Lambda (best value) | $1.99 | $5.29 | 2.7× |
| CoreWeave | $2.23 | $5.49 | 2.5× |
| RunPod | $1.99 | $5.98 | 3.0× |
| Google Cloud | $3.90 | $6.60 | 1.7× |
| AWS | $4.10 | $6.90 | 1.7× |
| Azure | $4.10 | $7.05 | 1.7× |

On-demand pricing only; spot and reserved rates are lower. See all pricing →
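
For the hyperscaler rows, the normalisation is simple division: node rate over GPU count. A minimal sketch using the AWS node rates quoted in the FAQ below:

```python
# Normalise hyperscaler node pricing to per-GPU rates: node $/hr divided by GPU count.
# Node rates are the AWS figures quoted in the FAQ below; other rows follow the same rule.
node_rates = {
    "AWS p5.48xlarge (8x H100)": (32.77, 8),
    "AWS p6.48xlarge (8x B200)": (55.20, 8),
}

for name, (rate, gpus) in node_rates.items():
    print(f"{name}: ${rate / gpus:.2f}/GPU/hr")
# AWS p5.48xlarge (8x H100): $4.10/GPU/hr
# AWS p6.48xlarge (8x B200): $6.90/GPU/hr
```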

Full H100 SXM5 On-Demand Prices (All Providers)

H100 SXM5 on-demand currently ranges from $1.79/hr (Shadeform) to ~$3.09/hr (Paperspace). Spot pricing from aggregators like Vast.ai can go as low as $1.35–1.49/hr with preemption risk.

| Provider | Price ($/GPU/hr) | Instance | Region |
|---|---|---|---|
| Shadeform | $1.79 (best price) | H100 SXM | Various |
| Together AI | $1.99 | H100 (Reserved Instances) | US |
| RunPod | $1.99 | NVIDIA H100 PCIe | us-east-1 |
| Lambda | $1.99 | 1x H100 SXM | US |
| Crusoe Energy | $2.06 | H100 SXM (Climate-Aligned) | US (Texas) |
| TensorDock | $2.09 | H100 SXM 80GB | US/EU |
| FluidStack | $2.15 | H100 SXM5 80GB | US/EU |
| Crusoe Cloud | $2.17 | h100-80gb-sxm-ib-1x | us-central |

Full B200 On-Demand Prices (All Providers)

B200 availability is still constrained in April 2026. Lambda and CoreWeave offer the most accessible single-GPU pricing; hyperscalers are available primarily as 8-GPU nodes through committed-use agreements.

| Provider | Price ($/GPU/hr) | Instance | Region |
|---|---|---|---|
| Lambda | $5.29 (best price) | 1x B200 SXM | US |
| CoreWeave | $5.49 | B200 SXM (Early Access) | US |
| RunPod | $5.98 | NVIDIA B200 | us-east-1 |
| Google Cloud | $6.60 | a4-highgpu-8g (8x B200) | us-central1 |
| AWS | $6.90 | p6.48xlarge (8x B200) | us-east-1 |
| Azure | $7.05 | ND B200 v6 (8x B200) | East US |

Which Should You Rent? A Workload-Based Decision Guide

B200's 3.0× price premium is justified only if your workload extracts a throughput gain large enough to offset it. Here's how to decide:

Choose H100 SXM5 if…

  • Running low-to-medium utilisation inference (GPU under 60% busy)
  • Fine-tuning models under 13B parameters where you don't hit memory bandwidth limits
  • Budget-constrained — H100 delivers the best absolute dollar efficiency for most use cases
  • Need immediate on-demand availability without waitlists
  • Running short experiments or dev/test workloads
  • Your framework/library doesn't yet have optimised Blackwell kernels

Choose B200 if…

  • Running high-throughput inference on 70B+ parameter models where bandwidth is the bottleneck
  • Training large models where iteration speed reduces time-to-result more than GPU-hours matter
  • Your batch sizes are large enough to saturate H100's 3.35 TB/s bandwidth consistently
  • Serving multiple concurrent inference requests where B200's larger KV cache is decisive
  • You need to fit a 405B+ parameter model on fewer GPUs (192GB vs 80GB VRAM; see the memory sketch after this list)
  • Running at sustained high utilisation where throughput advantage > 3.0× GPU-hour cost
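
On the memory point, a back-of-the-envelope sketch, assuming FP8 weights at roughly one byte per parameter and a flat 20% headroom for KV cache and activations (both numbers are assumptions, not measurements):

```python
import math

def gpus_needed(params_b: float, vram_gb: float,
                bytes_per_param: float = 1.0, overhead: float = 1.2) -> int:
    """GPUs required to hold the model: weight bytes plus a flat
    overhead factor standing in for KV cache and activations."""
    weight_gb = params_b * bytes_per_param   # FP8 ~ 1 byte per parameter (assumption)
    return math.ceil(weight_gb * overhead / vram_gb)

for name, vram in [("H100 (80GB)", 80), ("B200 (192GB)", 192)]:
    print(f"405B model on {name}: {gpus_needed(405, vram)} GPUs")
# 405B model on H100 (80GB): 7 GPUs
# 405B model on B200 (192GB): 3 GPUs
```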

Break-even rule of thumb: if one B200 can replace more than 3.0 H100s on your workload (that is, if its throughput multiple exceeds its price multiple), B200 wins on total cost. For memory-bandwidth-bound 70B inference at high batch sizes, expect a 2–2.5× throughput improvement, which is not enough to close the 3.0× gap. H100 typically wins unless utilisation is very high.
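
That rule is just cost per unit of work: B200 wins when its throughput multiple exceeds its price multiple. A minimal sketch using the cheapest on-demand rates above and an assumed 2.3× speedup (substitute your own measured number):

```python
h100_rate, b200_rate = 1.79, 5.29   # cheapest on-demand $/GPU/hr from the tables above
speedup = 2.3                       # ASSUMED B200 throughput multiple; measure your own workload

print(f"cost per unit of work  H100: ${h100_rate / 1.0:.2f}  B200: ${b200_rate / speedup:.2f}")
print(f"break-even speedup: {b200_rate / h100_rate:.2f}x")
# cost per unit of work  H100: $1.79  B200: $2.30
# break-even speedup: 2.96x
```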

Spec Comparison: H100 SXM5 vs B200

| Specification | H100 SXM5 | B200 SXM | B200 Advantage |
|---|---|---|---|
| FP8 Throughput | 3,958 TFLOPS | 9,000 TFLOPS | +127% (2.27×) |
| BF16 Throughput | 1,979 TFLOPS | 4,500 TFLOPS | +127% (2.27×) |
| Memory Bandwidth | 3.35 TB/s | 8.0 TB/s | +139% (2.39×) |
| VRAM | 80GB HBM3 | 192GB HBM3e | +140% (2.4×) |
| Architecture | Hopper | Blackwell | |
| On-Demand Price (lowest) | $1.79/hr | $5.29/hr | 3.0× more expensive |

Provider Availability by GPU

H100 SXM5 — Wide Availability

H100 is the most widely available datacenter GPU in the cloud market. 27+ providers offer on-demand access with no waitlist.

  • Shadeform · $1.79/hr · Various
  • Together AI · $1.99/hr · US
  • RunPod · $1.99/hr · us-east-1
  • Lambda · $1.99/hr · US
  • Crusoe Energy · $2.06/hr · US (Texas)
  • TensorDock · $2.09/hr · US/EU

B200 — Limited, Growing Availability

B200 supply is constrained in April 2026. Six providers offer access: Lambda, CoreWeave, and RunPod on demand, plus the hyperscalers primarily through committed-use contracts.

  • Lambda · $5.29/hr · US
  • CoreWeave · $5.49/hr · US
  • RunPod · $5.98/hr · us-east-1
  • Google Cloud · $6.60/hr · us-central1
  • AWS · $6.90/hr · us-east-1
  • Azure · $7.05/hr · East US

Run Your Workload Through the Calculator

Compare H100 vs B200 total cost for your exact GPU-hours, batch size, and utilisation rate. Takes 30 seconds.

Open GPU Cost Calculator →
Full Pricing Table →

Frequently Asked Questions

How much do H100 and B200 cost per hour right now?

As of April 2026, the cheapest H100 SXM5 on-demand is $1.79/hr at Shadeform, while the cheapest B200 is $5.29/hr at Lambda. That's a 3.0× price premium for B200. For 8-GPU cluster nodes: AWS H100 (p5.48xlarge) runs ~$32.77/hr (~$4.10/GPU), while AWS B200 (p6.48xlarge) runs ~$55.20/hr (~$6.90/GPU). GridStackHub tracks these prices daily across 58+ providers; prices are updated automatically.

Is the B200 worth 3× the price of an H100?

For most workloads, no. H100 remains the better value at $1.79/hr. B200's ~2.27× compute and ~2.39× memory bandwidth advantage doesn't fully offset the 3.0× price gap except in specific scenarios: continuous high-throughput batch inference on 70B+ parameter models, very large distributed training jobs where iteration speed matters more than GPU-hours, or workloads where B200's 192GB VRAM (vs H100's 80GB) eliminates model sharding overhead. If your GPU is under 70% utilised, H100 wins on absolute cost. The crossover point is high and depends on workload characteristics; use the GPU cost calculator to model your specific case.

Which providers offer both H100 and B200 on-demand?

As of April 2026, providers carrying both H100 and B200 on-demand include Lambda, CoreWeave, RunPod, Google Cloud, AWS, and Azure. Lambda is the standout for single-GPU access to both (H100 from ~$1.99/hr, B200 from ~$5.29/hr), with no waitlist on H100 and generally fast access to B200. Hyperscalers (AWS, Google, Azure) offer both, but primarily as 8-GPU cluster nodes with higher effective per-GPU rates.

How much faster is B200 than H100 for LLM inference?

For memory-bandwidth-bound decoding (the most common LLM inference bottleneck), expect 2–2.5× more tokens per second on B200 vs H100 SXM5. The gain comes primarily from B200's 2.39× higher memory bandwidth (8.0 TB/s vs 3.35 TB/s), which determines how fast the GPU can load model weights on each forward pass. For prefill (compute-bound), the 2.27× FP8 advantage applies more directly. Real-world gains depend on batch size and sequence length; larger batches get closer to the theoretical maximum. B200's 192GB VRAM also allows serving larger models or larger KV caches, which can improve per-GPU throughput beyond the raw bandwidth numbers.
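
The bandwidth-bound claim is easy to sanity-check with a roofline-style estimate: at batch size 1, decode speed is capped by how fast the weights stream from HBM. A rough sketch for a 70B model in FP8 (~1 byte/param is an assumption; real deployments add KV cache traffic and never hit the peak, so treat these as ceilings):

```python
def max_decode_tok_s(bandwidth_tb_s: float, model_gb: float) -> float:
    """Batch-1 decode ceiling: each generated token streams all weights
    from HBM once, so tokens/s <= memory bandwidth / model size."""
    return bandwidth_tb_s * 1000 / model_gb   # TB/s -> GB/s

model_gb = 70   # 70B parameters at ~1 byte/param (FP8, assumed)
h100 = max_decode_tok_s(3.35, model_gb)
b200 = max_decode_tok_s(8.0, model_gb)
print(f"H100: {h100:.0f} tok/s  B200: {b200:.0f} tok/s  ({b200 / h100:.2f}x)")
# H100: 48 tok/s  B200: 114 tok/s  (2.39x)
```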

Can I rent an H100 on-demand with no minimum commitment?

Yes. Multiple providers offer H100 SXM5 on-demand with no minimum commitment: Shadeform ($1.79/hr), Together AI ($1.99/hr), RunPod ($1.99/hr), Lambda ($1.99/hr), and Crusoe Energy ($2.06/hr). The cheapest on-demand H100 is currently $1.79/hr at Shadeform. GPU aggregators like Shadeform route to whichever cloud has the lowest available rate at booking time. For spot (preemptible) pricing, Vast.ai and RunPod often go below $1.50/hr, with interruption risk. GridStackHub's cheapest GPU comparison tracks real-time prices across all providers.
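
Mechanically, that aggregator routing reduces to taking the minimum over live quotes. A toy sketch over the static rates listed on this page (a real aggregator works from live availability at booking time, not a fixed table):

```python
# Static snapshot of the on-demand H100 rates listed above ($/GPU/hr).
quotes = {"Shadeform": 1.79, "Together AI": 1.99, "RunPod": 1.99,
          "Lambda": 1.99, "Crusoe Energy": 2.06}

provider, rate = min(quotes.items(), key=lambda kv: kv[1])
print(f"route to {provider} at ${rate:.2f}/hr")   # route to Shadeform at $1.79/hr
```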