GPU cloud pricing in 2026 ranges from $1.49/hr for H100 spot on Vast.ai to $8.60/hr for B200 on-demand at CoreWeave, with spot discounts of 40–70% and reserved discounts of 20–55% depending on provider and commitment term. The cheapest on-demand H100 is $1.79/hr through Shadeform (aggregator) or $1.99/hr direct through Together AI.
GPU Pricing Questions & Answers
GPU spot pricing (also called preemptible or interruptible pricing) gives you access to unused cloud GPU capacity at a discount of 40–70% versus on-demand rates. The tradeoff is that the provider can reclaim the instance when demand spikes, with warning windows as short as 30 seconds (GCP) or as long as 2 minutes (AWS).
Spot is best for interruption-tolerant workloads that implement checkpoint-resume: training jobs, fine-tuning, batch inference, and preprocessing. When a preemption warning arrives, your workload saves its state to object storage (S3, GCS, or R2) and restarts from that checkpoint on a new instance. Most training frameworks (PyTorch Lightning, Hugging Face Trainer, DeepSpeed) support saving and resuming from checkpoints natively.
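The checkpoint-resume pattern can be sketched in a few lines of Python. This is a minimal illustration, not any provider's or framework's actual mechanism. It assumes the preemption warning arrives as a SIGTERM signal (some providers expose the warning through a metadata endpoint instead), uses a small JSON file in place of real model state, and writes to a hypothetical local path that you would sync to object storage in practice. The names `install_preemption_handler`, `save_checkpoint`, `load_checkpoint`, and `train` are illustrative.

```python
import json
import os
import signal
import tempfile

CHECKPOINT_PATH = "checkpoint.json"  # hypothetical path; sync to S3/GCS/R2 in practice

stop_requested = False


def request_stop(signum, frame):
    """Record that a preemption warning arrived so the loop can exit cleanly."""
    global stop_requested
    stop_requested = True


def install_preemption_handler():
    """Assumption: the provider signals preemption with SIGTERM."""
    signal.signal(signal.SIGTERM, request_stop)


def save_checkpoint(state, path=CHECKPOINT_PATH):
    """Write to a temp file, then rename, so a kill mid-write cannot corrupt the checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path=CHECKPOINT_PATH):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0}


def train(total_steps, checkpoint_every=100, path=CHECKPOINT_PATH):
    """Run or resume a toy training loop and return the last completed step."""
    state = load_checkpoint(path)
    while state["step"] < total_steps and not stop_requested:
        state["step"] += 1  # placeholder for one real training step
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)  # final save, also taken when preempted
    return state["step"]
```

Calling `install_preemption_handler()` once at startup and then `train(...)` on every (re)start is enough for this sketch: a fresh instance picks up from the last saved step instead of step 0.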
According to GridStackHub.ai data, H100 spot on Vast.ai runs $1.49/hr versus $2.49/hr on-demand at RunPod — a 40% discount. A100 spot on Vast.ai reaches $0.89/hr versus $2.79/hr on-demand at Lambda — a 68% discount.
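The discount figures above follow from a one-line formula. The sketch below reproduces them from the quoted rates. The helper name `discount_pct` is illustrative, and both comparisons are across different providers, as in the text.

```python
def discount_pct(discounted_rate, baseline_rate):
    """Percentage saved by paying discounted_rate instead of baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (1 - discounted_rate / baseline_rate) * 100


# Rates quoted above, in $/hr
h100_discount = discount_pct(1.49, 2.49)  # Vast.ai spot vs RunPod on-demand
a100_discount = discount_pct(0.89, 2.79)  # Vast.ai spot vs Lambda on-demand

print(f"H100: {h100_discount:.0f}%  A100: {a100_discount:.0f}%")
```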
On-demand is pay-as-you-go with no commitment — you pay the listed rate and can terminate at any time. Best for unpredictable workloads and short jobs.
Spot (preemptible/community cloud) offers 40–70% discounts on unused capacity but the provider can reclaim instances with short notice. Best for fault-tolerant training, fine-tuning, and batch inference with checkpointing.
Reserved requires a 1-month to 3-year commitment in exchange for 20–55% off the on-demand rate. Best for predictable, long-running inference serving workloads.
Example: A CoreWeave H100 costs $3.92/hr on-demand, $1.79/hr on a 1-year reserved contract (54% savings), or $1.49–$1.89/hr on Vast.ai spot (no commitment, preemptible). Choose spot for training/batch with checkpointing, on-demand for short deadline-sensitive jobs, reserved for stable production inference.
B200 GPUs cost significantly more per hour than H100s. According to GridStackHub.ai data, B200 on-demand pricing ranges from $5.98/hr (RunPod) to $8.60/hr (CoreWeave) per GPU, against an H100 on-demand range of $1.79–$3.92/hr. Compared on the same provider, that is roughly 2.2–2.4× the H100 rate ($5.98 vs $2.49 at RunPod, $8.60 vs $3.92 at CoreWeave).
However, B200 delivers approximately 2.5× the inference throughput of H100 for large language models due to its NVLink 5 interconnect, 192GB HBM3e memory, and next-generation compute cores. At scale, cost-per-token is near parity or slightly better on B200 for memory-bound workloads.
For training, B200's 192GB HBM3e memory (vs 80GB on H100 SXM5) enables larger batch sizes and reduces the need to shard a model across several GPUs, partially offsetting the higher hourly cost. The B200 economics favor teams running at sustained high utilization with memory-bound workloads. H100 remains the better value for light, intermittent, or compute-bound jobs.
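The cost-per-token comparison can be checked with a rough calculation: divide the B200/H100 price ratio by the throughput ratio. The sketch below uses the same-provider on-demand rates quoted in this document and takes the approximate 2.5× throughput figure as given. Real throughput depends on model, batch size, and serving stack, so treat the result as an estimate. The function name is illustrative.

```python
def relative_cost_per_token(b200_rate, h100_rate, throughput_ratio=2.5):
    """B200 cost per token divided by H100 cost per token.

    A value below 1.0 means B200 is cheaper per token.
    throughput_ratio is the assumed B200/H100 throughput multiple.
    """
    if h100_rate <= 0 or throughput_ratio <= 0:
        raise ValueError("rates and throughput_ratio must be positive")
    return (b200_rate / h100_rate) / throughput_ratio


coreweave = relative_cost_per_token(8.60, 3.92)  # on-demand, $/hr
runpod = relative_cost_per_token(5.98, 2.49)     # on-demand, $/hr

print(f"CoreWeave: {coreweave:.2f}  RunPod: {runpod:.2f}")
```

Under these assumptions both ratios land a little below 1.0, which matches the "near parity or slightly better" claim. If your measured throughput gain is below the price ratio, H100 is cheaper per token.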
Reserved GPU pricing typically saves 20–55% versus on-demand rates, with savings scaling with commitment length.
From GridStackHub.ai live data:
- CoreWeave H100: $1.79/hr reserved-1yr vs $3.92/hr on-demand (54% saving)
- Corvex H100: $1.59/hr reserved-1yr vs $3.15/hr on-demand (50% saving)
- AWS H100 (8-GPU p5.48xlarge, per-instance rates): $19.22/hr reserved-1yr vs $32.77/hr on-demand (41% saving)
- Google Cloud H100 (8-GPU instance, per-instance rates): $19.63/hr reserved-1yr vs $31.21/hr on-demand (37% saving)
- Lightning AI H100: $2.19/hr reserved-1mo vs $2.89/hr on-demand (24% saving)
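Break-even utilization equals the reserved rate divided by the on-demand rate. For the 1-year rates above that is roughly 46–63%, and about 76% for the 1-month Lightning AI rate. Below the break-even threshold, on-demand costs less in aggregate because you only pay for hours used.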
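To see where the break-even falls for each provider, the sketch below divides each reserved rate by the matching on-demand rate from the list above. The AWS and Google Cloud entries use instance-level rates, which does not affect the ratio. The dictionary and function names are illustrative.

```python
# (reserved $/hr, on-demand $/hr) as listed above
RATES = {
    "CoreWeave H100": (1.79, 3.92),
    "Corvex H100": (1.59, 3.15),
    "AWS p5.48xlarge": (19.22, 32.77),
    "Google Cloud H100": (19.63, 31.21),
    "Lightning AI H100": (2.19, 2.89),
}


def break_even_utilization(reserved_rate, on_demand_rate):
    """Fraction of committed hours you must use for reserved to match on-demand."""
    if on_demand_rate <= 0:
        raise ValueError("on_demand_rate must be positive")
    return reserved_rate / on_demand_rate


break_evens = {
    name: break_even_utilization(reserved, on_demand)
    for name, (reserved, on_demand) in RATES.items()
}

for name, value in break_evens.items():
    print(f"{name}: {value:.0%}")
```

The deeper the discount, the lower the utilization you need before a commitment pays off.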
Good candidates for spot:
- LLM pre-training and fine-tuning with checkpoint-resume (every 15–30 minutes)
- Batch inference over large datasets (not latency-sensitive)
- Hyperparameter search and model evaluation runs
- Embedding generation pipelines
- Data preprocessing and tokenization
Poor candidates for spot:
- Real-time inference APIs (requires guaranteed uptime and low latency)
- Interactive Jupyter notebooks (work loss risk)
- Jobs under 30 minutes (checkpoint overhead reduces benefit)
- Any workload without checkpointing implemented
The rule: if losing 30 minutes of work is acceptable and the job can restart from a checkpoint, it qualifies for spot pricing. Training jobs over 4 hours with 30-minute checkpointing see the best economics on spot.
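The rule and the two lists above can be expressed as a small eligibility check. This is one possible encoding. The `Workload` fields and the `qualifies_for_spot` name are illustrative, and the 30-minute cutoff comes from the lists above.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    runtime_minutes: float
    has_checkpointing: bool
    latency_sensitive: bool = False  # e.g. a real-time inference API
    interactive: bool = False        # e.g. a Jupyter notebook session


def qualifies_for_spot(workload, min_runtime_minutes=30):
    """Return True if the workload fits the spot criteria listed above."""
    if workload.latency_sensitive or workload.interactive:
        return False  # needs guaranteed uptime or risks losing live work
    if not workload.has_checkpointing:
        return False  # cannot resume after a preemption
    # Very short jobs gain little once checkpoint overhead is counted.
    return workload.runtime_minutes >= min_runtime_minutes


fine_tune = Workload(runtime_minutes=6 * 60, has_checkpointing=True)
api_server = Workload(runtime_minutes=24 * 60, has_checkpointing=True, latency_sensitive=True)

print(qualifies_for_spot(fine_tune), qualifies_for_spot(api_server))
```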
Egress fees can materially raise effective GPU costs when moving data out of a cloud provider:
- AWS: $0.09/GB egress
- Azure: $0.087/GB egress
- GCP: $0.12/GB egress
- Lambda Labs: 10TB/month free, then $0.10/GB
- Cloudflare R2: $0 egress (zero-egress object storage)
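A quick way to budget for egress is to multiply the billable gigabytes by the per-GB rate. The sketch below uses the flat rates listed above and ignores tiered pricing and free allowances other than Lambda Labs' 10TB. It also assumes 1 TB = 1,000 GB. The table and function names are illustrative.

```python
# provider -> (rate in $/GB, free allowance in GB per month), from the list above
EGRESS = {
    "AWS": (0.09, 0),
    "Azure": (0.087, 0),
    "GCP": (0.12, 0),
    "Lambda Labs": (0.10, 10_000),  # 10 TB free, assuming 1 TB = 1,000 GB
    "Cloudflare R2": (0.0, 0),
}


def egress_cost(provider, gigabytes):
    """Estimated monthly egress charge in dollars at a flat per-GB rate."""
    if gigabytes < 0:
        raise ValueError("gigabytes must be non-negative")
    rate, free_gb = EGRESS[provider]
    billable_gb = max(0, gigabytes - free_gb)
    return billable_gb * rate


for provider in EGRESS:
    print(f"{provider}: ${egress_cost(provider, 12_000):,.2f} for 12 TB")
```

At these rates, repeatedly pulling multi-terabyte checkpoints out of a hyperscaler can add a noticeable amount to a training run's bill.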
EU-based providers (OVHcloud, Hetzner, DataCrunch, Nebius) typically price H100 10–25% higher than US providers, driven by electricity costs (EU average 15–22¢/kWh versus US data center average 6–8¢/kWh in Virginia and Texas). The gap varies by provider pairing: OVHcloud EU H100 is $4.50/hr versus Lambda US at $4.29/hr, a premium of about 5%.
For data-residency-sensitive workloads (GDPR, HIPAA), EU providers may be required regardless of price premium. For pure cost-optimization AI training with no residency constraints, US-based independent cloud providers consistently offer the lowest rates.
According to GridStackHub.ai live data, the cheapest H100 pricing depends on the pricing type:
- Spot/interruptible: Vast.ai at $1.49/hr (H100 SXM marketplace, preemptible)
- On-demand, no commitment: Shadeform aggregator at $1.79/hr (routes to cheapest available); Together AI at $1.99/hr direct
- On-demand, established providers: RunPod $2.49/hr, Hetzner $2.49/hr (EU), Crusoe Cloud $2.17/hr (US, climate-aligned)
- Reserved 1-year: Corvex $1.59/hr, CoreWeave $1.79/hr
Hyperscalers (AWS $4.09/hr per GPU, GCP $3.90/hr, Azure $4.10/hr) charge significantly more on-demand but offer 60–70% spot discounts on that inflated baseline — bringing spot prices near parity with independent cloud. Use the GPU Cost Calculator to compare total cost for your specific workload and GPU count.
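To see why hyperscaler spot lands near independent-cloud pricing, apply the 60–70% discount range to the per-GPU on-demand rates above. The sketch below is only arithmetic on the figures quoted in this document, not live spot prices. The `spot_range` name is illustrative.

```python
def spot_range(on_demand_rate, min_discount=0.60, max_discount=0.70):
    """Return (low, high) implied spot price for a discount range."""
    if not 0 <= min_discount <= max_discount < 1:
        raise ValueError("discounts must satisfy 0 <= min <= max < 1")
    low = on_demand_rate * (1 - max_discount)
    high = on_demand_rate * (1 - min_discount)
    return low, high


# Per-GPU on-demand H100 rates quoted above, in $/hr
HYPERSCALERS = {"AWS": 4.09, "GCP": 3.90, "Azure": 4.10}

for name, rate in HYPERSCALERS.items():
    low, high = spot_range(rate)
    print(f"{name}: ${low:.2f} to ${high:.2f}/hr implied spot")
```

The implied range of roughly $1.17 to $1.64/hr brackets the $1.49/hr Vast.ai spot rate.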
The basic formula is: total cost = number of GPUs × hourly rate × hours
Example: fine-tuning a 7B model on 8× H100 for 72 hours
- CoreWeave on-demand ($3.92/hr): 8 × $3.92 × 72 = $2,258
- CoreWeave reserved-1yr ($1.79/hr): 8 × $1.79 × 72 = $1,031
- Vast.ai spot ($1.49/hr): 8 × $1.49 × 72 = $858 (with preemption risk)
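The formula translates directly into code. The sketch below evaluates it for the three scenarios in the example. It covers GPU hours only, so storage and egress need to be added separately. The function and dictionary names are illustrative.

```python
def total_gpu_cost(num_gpus, hourly_rate, hours):
    """Total cost = number of GPUs x hourly rate per GPU x hours."""
    if num_gpus < 0 or hourly_rate < 0 or hours < 0:
        raise ValueError("inputs must be non-negative")
    return num_gpus * hourly_rate * hours


# Per-GPU hourly rates from the example above, in $/hr
SCENARIOS = {
    "CoreWeave on-demand": 3.92,
    "CoreWeave reserved-1yr": 1.79,
    "Vast.ai spot": 1.49,
}

costs = {name: total_gpu_cost(8, rate, 72) for name, rate in SCENARIOS.items()}

for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}")
```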
Additional costs to include in your budget:
- Object storage for checkpoints: ~$0.023/GB/month on S3
- Egress when downloading the trained model or large datasets
- CPU/RAM cost bundled into GPU instance (usually minor)
Use the GridStackHub GPU Cost Calculator to get provider-by-provider estimates for your specific GPU count, runtime, and workload type — including spot savings projections.
Reserved GPU pricing beats on-demand when your average utilization over the commitment period exceeds the break-even threshold.
CoreWeave H100 break-even example:
- Reserved rate: $1.79/hr
- On-demand rate: $3.92/hr
- Discount: 54%
- Break-even utilization: 46% (reserved rate ÷ on-demand rate = $1.79 ÷ $3.92)
If your GPUs run for more than 46% of the committed hours, reserved wins. At 100% utilization, reserved saves about $1,555/GPU/month versus on-demand ($2.13/hr × 730 hours). At 40% utilization, on-demand ($3.92 × 0.40 = $1.57 effective per committed hour) costs less than the $1.79 reserved rate.
Practical guideline: commit reserved for inference serving (typically 70%+ utilization) and use on-demand or spot for training jobs that run in bursts. The Reserved Instance Advisor computes this break-even for your specific usage patterns.
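The utilization comparison can be reproduced with a short calculation. The sketch below uses the CoreWeave H100 rates from the example and assumes a 730-hour month. Reserved capacity is billed for every committed hour, while on-demand is billed only for the hours actually used. The names are illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def monthly_costs(reserved_rate, on_demand_rate, utilization, hours=HOURS_PER_MONTH):
    """Return (reserved_cost, on_demand_cost) per GPU for one month."""
    if not 0 <= utilization <= 1:
        raise ValueError("utilization must be between 0 and 1")
    reserved_cost = reserved_rate * hours                # paid whether used or not
    on_demand_cost = on_demand_rate * hours * utilization  # paid only when running
    return reserved_cost, on_demand_cost


for utilization in (0.40, 0.50, 0.70, 1.00):
    reserved, on_demand = monthly_costs(1.79, 3.92, utilization)
    cheaper = "reserved" if reserved < on_demand else "on-demand"
    print(f"{utilization:.0%}: reserved ${reserved:,.2f}, "
          f"on-demand ${on_demand:,.2f}, cheaper: {cheaper}")
```

With these rates the crossover sits at $1.79 ÷ $3.92, so on-demand is cheaper at 40% utilization and reserved is cheaper at 50% and above.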
GPU pricing varies 10–40% by region, driven by electricity costs, local competition, and supply availability.
US East (Virginia) data centers benefit from ~6.58¢/kWh electricity, enabling the lowest GPU rates. EU providers face 15–22¢/kWh, which contributes to 10–25% pricing premiums. Representative H100 on-demand rates, with Lambda US as a baseline:
- OVHcloud EU: H100 $4.50/hr (France)
- DataCrunch EU: H100 $4.00/hr (Finland)
- Hetzner EU: H100 $2.49/hr (Germany) — competitive due to scale
- Lambda US: H100 $4.29/hr (1× SXM)
- Genesis Cloud EU: H100 $3.59/hr (Iceland, 100% renewable)
Within the US, the region matters: GPU spot prices differ 10–20% between us-east-1 and us-west-2 during peak demand. For data-residency-sensitive workloads (GDPR, HIPAA), EU providers may be required regardless of price. See the full pricing comparison to filter by region and GPU model.
Compare live GPU pricing across 34+ providers
GridStackHub tracks 479+ GPU pricing records daily — on-demand, spot, and reserved — filtered by GPU model, provider, or region.