Price Comparison: Providers With Both H100 and B200

The table below shows per-GPU on-demand pricing for providers that currently offer both H100 SXM5 and B200 capacity. Hyperscaler prices are normalised to per-GPU rates (e.g. AWS's 8-GPU node rate ÷ 8). Data is pulled live from the GridStackHub.ai database.

| Provider | H100 SXM5 ($/GPU/hr) | B200 ($/GPU/hr) | B200/H100 Ratio |
|---|---|---|---|
| Lambda (best value) | $1.99 | $5.29 | 2.7× |
| CoreWeave | $2.23 | $5.49 | 2.5× |
| RunPod | $1.99 | $5.98 | 3.0× |
| Google Cloud | $3.90 | $6.60 | 1.7× |
| AWS | $4.10 | $6.90 | 1.7× |
| Azure | $4.10 | $7.05 | 1.7× |

On-demand pricing only; spot and reserved rates are lower. See all pricing →
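
For the hyperscaler rows, the normalisation is simple division: node rate over GPU count. A minimal sketch using the AWS node rates quoted in the FAQ below:

```python
# Normalise hyperscaler node pricing to per-GPU rates: node $/hr divided by GPU count.
# Node rates are the AWS figures quoted in the FAQ below; other rows follow the same rule.
node_rates = {
    "AWS p5.48xlarge (8x H100)": (32.77, 8),
    "AWS p6.48xlarge (8x B200)": (55.20, 8),
}

for name, (rate, gpus) in node_rates.items():
    print(f"{name}: ${rate / gpus:.2f}/GPU/hr")
# AWS p5.48xlarge (8x H100): $4.10/GPU/hr
# AWS p6.48xlarge (8x B200): $6.90/GPU/hr
```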

Full H100 SXM5 On-Demand Prices (All Providers)

H100 SXM5 on-demand currently ranges from $1.79/hr (Shadeform) to ~$3.09/hr (Paperspace). Spot pricing from aggregators like Vast.ai can go as low as $1.35–1.49/hr with preemption risk.

| Provider | Price ($/GPU/hr) | Instance | Region |
|---|---|---|---|
| Shadeform | $1.79 (best price) | H100 SXM | Various |
| Together AI | $1.99 | H100 (Reserved Instances) | US |
| RunPod | $1.99 | NVIDIA H100 PCIe | us-east-1 |
| Lambda | $1.99 | 1x H100 SXM | US |
| Crusoe Energy | $2.06 | H100 SXM (Climate-Aligned) | US (Texas) |
| TensorDock | $2.09 | H100 SXM 80GB | US/EU |
| FluidStack | $2.15 | H100 SXM5 80GB | US/EU |
| Crusoe Cloud | $2.17 | h100-80gb-sxm-ib-1x | us-central |

Full B200 On-Demand Prices (All Providers)

B200 availability is still constrained in April 2026. Lambda and CoreWeave offer the most accessible single-GPU pricing; hyperscalers are available primarily as 8-GPU nodes through committed-use agreements.

| Provider | Price ($/GPU/hr) | Instance | Region |
|---|---|---|---|
| Lambda | $5.29 (best price) | 1x B200 SXM | US |
| CoreWeave | $5.49 | B200 SXM (Early Access) | US |
| RunPod | $5.98 | NVIDIA B200 | us-east-1 |
| Google Cloud | $6.60 | a4-highgpu-8g (8x B200) | us-central1 |
| AWS | $6.90 | p6.48xlarge (8x B200) | us-east-1 |
| Azure | $7.05 | ND B200 v6 (8x B200) | East US |

Which Should You Rent? A Workload-Based Decision Guide

B200's 3.0× price premium is justified only if your workload extracts a throughput gain large enough to offset it. Here's how to decide:

Choose H100 SXM5 if…

  • Running low-to-medium utilisation inference (GPU under 60% busy)
  • Fine-tuning models under 13B parameters where you don't hit memory bandwidth limits
  • Budget-constrained — H100 delivers the best absolute dollar efficiency for most use cases
  • Need immediate on-demand availability without waitlists
  • Running short experiments or dev/test workloads
  • Your framework/library doesn't yet have optimised Blackwell kernels

Choose B200 if…

  • Running high-throughput inference on 70B+ parameter models where bandwidth is the bottleneck
  • Training large models where iteration speed reduces time-to-result more than GPU-hours matter
  • Your batch sizes are large enough to saturate H100's 3.35 TB/s bandwidth consistently
  • Serving multiple concurrent inference requests where B200's larger KV cache is decisive
  • You need to fit a 405B+ parameter model on fewer GPUs (192GB vs 80GB VRAM; see the memory sketch after this list)
  • Running at sustained high utilisation where throughput advantage > 3.0× GPU-hour cost
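
On the memory point, a back-of-the-envelope sketch, assuming FP8 weights at roughly one byte per parameter and a flat 20% headroom for KV cache and activations (both numbers are assumptions, not measurements):

```python
import math

def gpus_needed(params_b: float, vram_gb: float,
                bytes_per_param: float = 1.0, overhead: float = 1.2) -> int:
    """GPUs required to hold the model: weight bytes plus a flat
    overhead factor standing in for KV cache and activations."""
    weight_gb = params_b * bytes_per_param   # FP8 ~ 1 byte per parameter (assumption)
    return math.ceil(weight_gb * overhead / vram_gb)

for name, vram in [("H100 (80GB)", 80), ("B200 (192GB)", 192)]:
    print(f"405B model on {name}: {gpus_needed(405, vram)} GPUs")
# 405B model on H100 (80GB): 7 GPUs
# 405B model on B200 (192GB): 3 GPUs
```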

Break-even rule of thumb: if one B200 can replace more than 3.0 H100s on your workload (that is, if its throughput multiple exceeds its price multiple), B200 wins on total cost. For memory-bandwidth-bound 70B inference at high batch sizes, expect a 2–2.5× throughput improvement, which is not enough to close the 3.0× gap. H100 typically wins unless utilisation is very high.
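
That rule is just cost per unit of work: B200 wins when its throughput multiple exceeds its price multiple. A minimal sketch using the cheapest on-demand rates above and an assumed 2.3× speedup (substitute your own measured number):

```python
h100_rate, b200_rate = 1.79, 5.29   # cheapest on-demand $/GPU/hr from the tables above
speedup = 2.3                       # ASSUMED B200 throughput multiple; measure your own workload

print(f"cost per unit of work  H100: ${h100_rate / 1.0:.2f}  B200: ${b200_rate / speedup:.2f}")
print(f"break-even speedup: {b200_rate / h100_rate:.2f}x")
# cost per unit of work  H100: $1.79  B200: $2.30
# break-even speedup: 2.96x
```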

Spec Comparison: H100 SXM5 vs B200

| Specification | H100 SXM5 | B200 SXM | B200 Advantage |
|---|---|---|---|
| FP8 Throughput | 3,958 TFLOPS | 9,000 TFLOPS | +127% (2.27×) |
| BF16 Throughput | 1,979 TFLOPS | 4,500 TFLOPS | +127% (2.27×) |
| Memory Bandwidth | 3.35 TB/s | 8.0 TB/s | +139% (2.39×) |
| VRAM | 80GB HBM3 | 192GB HBM3e | +140% (2.4×) |
| Architecture | Hopper | Blackwell | |
| On-Demand Price (lowest) | $1.79/hr | $5.29/hr | 3.0× more expensive |

Provider Availability by GPU

H100 SXM5 — Wide Availability

H100 is the most widely available datacenter GPU in the cloud market. 27+ providers offer on-demand access with no waitlist.

  • Shadeform · $1.79/hr · Various
  • Together AI · $1.99/hr · US
  • RunPod · $1.99/hr · us-east-1
  • Lambda · $1.99/hr · US
  • Crusoe Energy · $2.06/hr · US (Texas)
  • TensorDock · $2.09/hr · US/EU

B200 — Limited, Growing Availability

B200 supply is constrained in April 2026. Six providers offer access: Lambda, CoreWeave, and RunPod on demand, plus the hyperscalers primarily through committed-use contracts.

  • Lambda · $5.29/hr · US
  • CoreWeave · $5.49/hr · US
  • RunPod · $5.98/hr · us-east-1
  • Google Cloud · $6.60/hr · us-central1
  • AWS · $6.90/hr · us-east-1
  • Azure · $7.05/hr · East US

Run Your Workload Through the Calculator

Compare H100 vs B200 total cost for your exact GPU-hours, batch size, and utilisation rate. Takes 30 seconds.

Open GPU Cost Calculator →
Full Pricing Table →

Frequently Asked Questions

How much do H100 and B200 cost per hour right now?

As of April 2026, the cheapest H100 SXM5 on-demand is $1.79/hr at Shadeform, while the cheapest B200 is $5.29/hr at Lambda. That's a 3.0× price premium for B200. For 8-GPU cluster nodes: AWS H100 (p5.48xlarge) runs ~$32.77/hr (~$4.10/GPU), while AWS B200 (p6.48xlarge) runs ~$55.20/hr (~$6.90/GPU). GridStackHub tracks these prices daily across 58+ providers; prices are updated automatically.

Is the B200 worth 3× the price of an H100?

For most workloads, no. H100 remains the better value at $1.79/hr. B200's ~2.27× compute and ~2.39× memory bandwidth advantage doesn't fully offset the 3.0× price gap except in specific scenarios: continuous high-throughput batch inference on 70B+ parameter models, very large distributed training jobs where iteration speed matters more than GPU-hours, or workloads where B200's 192GB VRAM (vs H100's 80GB) eliminates model sharding overhead. If your GPU is under 70% utilised, H100 wins on absolute cost. The crossover point is high and depends on workload characteristics; use the GPU cost calculator to model your specific case.

Which providers offer both H100 and B200 on-demand?

As of April 2026, providers carrying both H100 and B200 on-demand include Lambda, CoreWeave, RunPod, Google Cloud, AWS, and Azure. Lambda is the standout for single-GPU access to both (H100 from ~$1.99/hr, B200 from ~$5.29/hr), with no waitlist on H100 and generally fast access to B200. Hyperscalers (AWS, Google, Azure) offer both, but primarily as 8-GPU cluster nodes with higher effective per-GPU rates.

How much faster is B200 than H100 for LLM inference?

For memory-bandwidth-bound decoding (the most common LLM inference bottleneck), expect 2–2.5× more tokens per second on B200 vs H100 SXM5. The gain comes primarily from B200's 2.39× higher memory bandwidth (8.0 TB/s vs 3.35 TB/s), which determines how fast the GPU can load model weights on each forward pass. For prefill (compute-bound), the 2.27× FP8 advantage applies more directly. Real-world gains depend on batch size and sequence length; larger batches get closer to the theoretical maximum. B200's 192GB VRAM also allows serving larger models or larger KV caches, which can improve per-GPU throughput beyond the raw bandwidth numbers.
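
The bandwidth-bound claim is easy to sanity-check with a roofline-style estimate: at batch size 1, decode speed is capped by how fast the weights stream from HBM. A rough sketch for a 70B model in FP8 (~1 byte/param is an assumption; real deployments add KV cache traffic and never hit the peak, so treat these as ceilings):

```python
def max_decode_tok_s(bandwidth_tb_s: float, model_gb: float) -> float:
    """Batch-1 decode ceiling: each generated token streams all weights
    from HBM once, so tokens/s <= memory bandwidth / model size."""
    return bandwidth_tb_s * 1000 / model_gb   # TB/s -> GB/s

model_gb = 70   # 70B parameters at ~1 byte/param (FP8, assumed)
h100 = max_decode_tok_s(3.35, model_gb)
b200 = max_decode_tok_s(8.0, model_gb)
print(f"H100: {h100:.0f} tok/s  B200: {b200:.0f} tok/s  ({b200 / h100:.2f}x)")
# H100: 48 tok/s  B200: 114 tok/s  (2.39x)
```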

Can I rent an H100 on-demand with no minimum commitment?

Yes. Multiple providers offer H100 SXM5 on-demand with no minimum commitment: Shadeform ($1.79/hr), Together AI ($1.99/hr), RunPod ($1.99/hr), Lambda ($1.99/hr), and Crusoe Energy ($2.06/hr). The cheapest on-demand H100 is currently $1.79/hr at Shadeform. GPU aggregators like Shadeform route to whichever cloud has the lowest available rate at booking time. For spot (preemptible) pricing, Vast.ai and RunPod often go below $1.50/hr, with interruption risk. GridStackHub's cheapest GPU comparison tracks real-time prices across all providers.
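
Mechanically, that aggregator routing reduces to taking the minimum over live quotes. A toy sketch over the static rates listed on this page (a real aggregator works from live availability at booking time, not a fixed table):

```python
# Static snapshot of the on-demand H100 rates listed above ($/GPU/hr).
quotes = {"Shadeform": 1.79, "Together AI": 1.99, "RunPod": 1.99,
          "Lambda": 1.99, "Crusoe Energy": 2.06}

provider, rate = min(quotes.items(), key=lambda kv: kv[1])
print(f"route to {provider} at ${rate:.2f}/hr")   # route to Shadeform at $1.79/hr
```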