GPU Cloud Pricing FAQ 2026 — H100, B200, Spot, Reserved Answered

Q: How does GPU spot pricing work?

GPU spot pricing (also called preemptible pricing) gives you access to unused cloud GPU capacity at a discount of 40–70% versus on-demand rates. The tradeoff is that the provider can reclaim the instance when demand spikes, with warning windows as short as 30 seconds (GCP) or as long as 2 minutes (AWS). Spot is best for interruption-tolerant workloads that implement checkpoint-resume — training jobs, fine-tuning, batch inference, and preprocessing. When a preemption warning arrives, your workload saves its state to object storage and restarts from that checkpoint on a new instance. According to GridStackHub.ai data, H100 spot on Vast.ai runs $1.49/hr versus $2.49/hr on-demand at RunPod — a 40% discount.

Q: What is the difference between spot, on-demand, and reserved GPU pricing?

On-demand GPU pricing is pay-as-you-go with no commitment — you pay the listed rate and can terminate at any time. Spot pricing (or preemptible/community cloud) offers 40–70% discounts on unused capacity but the provider can reclaim instances with short notice. Reserved pricing requires a 1-month to 3-year commitment in exchange for 20–55% off the on-demand rate. For example, a CoreWeave H100 costs $3.92/hr on-demand, $1.79/hr on a 1-year reserved contract (54% savings), or $1.49–$1.89/hr on Vast.ai spot (no commitment, preemptible). Best choice: spot for training/batch with checkpointing, on-demand for short or deadline-sensitive jobs, reserved for predictable long-running inference workloads.

Q: How does B200 pricing compare with H100 per GPU-hour, and is the cost-per-token different?

B200 GPUs cost significantly more per hour than H100s. According to GridStackHub.ai data, B200 on-demand pricing ranges from $5.98/hr (RunPod) to $8.60/hr (CoreWeave) per GPU — roughly 2–3× the H100 on-demand rate of $1.99–$3.92/hr. However, B200 delivers approximately 2.5× the inference throughput of H100 for large language models, so cost-per-token at scale is near parity or slightly better on B200 for memory-bound workloads. For training, B200's 192GB HBM3e memory (vs 80GB on H100 SXM5) enables larger batch sizes and faster iteration, partially offsetting the higher hourly cost. The B200 economics favor teams running at sustained high utilization; H100 remains cheaper for light or intermittent workloads.

Q: How much can I save with reserved GPU pricing?

Reserved GPU pricing typically saves 20–55% versus on-demand rates, with savings scaling with commitment length. According to GridStackHub.ai data, CoreWeave H100 reserved at 1 year costs $1.79/hr versus $3.92/hr on-demand, a 54% saving. AWS H100 reserved (1-year, p5.48xlarge) runs $19.22/hr for the 8-GPU instance versus $32.77/hr on-demand, a 41% saving. Break-even utilization is the reserved rate divided by the on-demand rate: about 46% for the CoreWeave example ($1.79 ÷ $3.92) and about 59% for the AWS one ($19.22 ÷ $32.77). Above the break-even, reserved beats on-demand; below it, you pay for idle capacity and on-demand costs less in aggregate.

Q: Which workloads are best suited for spot GPU instances?

Spot GPU instances are best for fault-tolerant, resumable workloads: LLM pre-training and fine-tuning with checkpoint-resume (every 15–30 minutes), batch inference over large datasets, hyperparameter search, model evaluation, embedding generation, and data preprocessing pipelines. Poor candidates for spot include real-time inference APIs (which require guaranteed uptime), interactive Jupyter notebooks (state loss risk), and short jobs under 30 minutes where checkpoint overhead reduces the benefit. The rule: if losing 30 minutes of work is acceptable and the job can restart from a checkpoint, it qualifies for spot pricing.

Q: How do I calculate GPU cost for a training run?

The basic formula is: total cost = number of GPUs × hourly rate × hours. Example: fine-tuning a 7B model using 8× H100 for 72 hours on CoreWeave on-demand ($3.92/hr per GPU) costs 8 × $3.92 × 72 = $2,258. The same job at the CoreWeave 1-year reserved rate ($1.79/hr) costs 8 × $1.79 × 72 = $1,031, a $1,227 saving, provided you already hold that commitment. On Vast.ai spot ($1.49/hr), it costs 8 × $1.49 × 72 = $858, but adds checkpoint overhead and interruption risk. Additional costs to budget: object storage for checkpoints ($0.023/GB/month on S3), egress when downloading the fine-tuned model, and CPU/RAM costs bundled into the GPU instance. Use the GridStackHub GPU Cost Calculator at /calculator to get provider-by-provider estimates for your specific GPU count and runtime.

Q: How much does GPU cloud pricing vary by region?

GPU pricing varies 10–40% by region, driven by data center electricity costs, local competition, and supply availability. US East (Virginia) data centers benefit from ~6.58¢/kWh electricity costs, enabling lower GPU rates. EU providers face 15–22¢/kWh electricity costs, which can flow into 10–25% pricing premiums, though the gap is not uniform: OVHcloud EU H100 is $4.50/hr, about 5% above Lambda US H100 at $4.29/hr, while DataCrunch EU H100 at $4.00/hr is below it. Within the US, region choice matters too: GPU spot prices can differ 10–20% between us-east-1 and us-west-2 for the same instance type during peak demand. For workloads with EU data-residency requirements (for example, GDPR-driven ones), EU providers may be required regardless of price premium; other compliance regimes such as HIPAA can also constrain provider choice. For purely cost-driven AI training with no residency constraints, US-based independent cloud providers consistently offer the lowest H100 and A100 rates.

Live data — GPU pricing updated daily across 34+ providers

GPU cloud pricing in 2026 ranges from $1.49/hr for H100 spot on Vast.ai to $8.60/hr for B200 on-demand at CoreWeave, with spot discounts of 40–70% and reserved discounts of 20–55% depending on provider and commitment term. The cheapest on-demand H100 is $1.79/hr through Shadeform (aggregator) or $1.99/hr direct through Together AI.

H100 Spot: $1.49/hr, Vast.ai marketplace (cheapest spot)
H100 On-Demand: $1.79/hr, Shadeform aggregator (no commitment)
H100 Reserved: $1.59/hr, Corvex 1yr vs $3.15/hr on-demand (−50%)
Reserved Savings: 20–55% vs on-demand rate (typical range)

GPU Pricing Questions & Answers

1. How does GPU spot pricing work?

GPU spot pricing (also called preemptible or interruptible pricing) gives you access to unused cloud GPU capacity at a discount of 40–70% versus on-demand rates. The tradeoff is that the provider can reclaim the instance when demand spikes, with warning windows as short as 30 seconds (GCP) or as long as 2 minutes (AWS).

Spot is best for interruption-tolerant workloads that implement checkpoint-resume: training jobs, fine-tuning, batch inference, and preprocessing. When a preemption warning arrives, your workload saves its state to object storage (S3, GCS, or R2) and restarts from that checkpoint on a new instance. Most training frameworks, including PyTorch Lightning, HuggingFace Trainer, and DeepSpeed, support saving and resuming from checkpoints natively; reacting to the provider's preemption warning may still need to be wired up separately.
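The checkpoint-resume pattern described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any provider's or framework's actual mechanism: it assumes the preemption warning reaches the process as a SIGTERM (providers differ in how they deliver the notice), it uses a local JSON file as a stand-in for object storage, and the names `CHECKPOINT_PATH`, `handle_preemption`, and `train` are hypothetical.

```python
import json
import os
import signal
import tempfile
import threading

# Hypothetical local path standing in for an object-storage key (S3, GCS, R2).
CHECKPOINT_PATH = os.path.join(tempfile.gettempdir(), "spot_demo_checkpoint.json")

# Set when a preemption warning arrives; the training loop polls it.
stop_event = threading.Event()


def handle_preemption(signum, frame):
    """Signal handler: only set a flag so the loop can save at a safe point."""
    stop_event.set()


def save_checkpoint(state, path=CHECKPOINT_PATH):
    """Write to a temp file, then rename, so a kill mid-write keeps the old checkpoint."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path=CHECKPOINT_PATH):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0}


def train(total_steps, checkpoint_every=100, path=CHECKPOINT_PATH):
    """Resume from the last checkpoint and run until done or preempted."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        state["step"] += 1  # placeholder for one real training step
        finished = state["step"] == total_steps
        if stop_event.is_set() or finished or state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
        if stop_event.is_set():
            break
    return state["step"]


if __name__ == "__main__":
    # Assumption: the preemption notice is delivered as SIGTERM.
    signal.signal(signal.SIGTERM, handle_preemption)
    print("reached step", train(total_steps=1000))
```

Running the script again after an interruption picks up from the saved step rather than from zero. With a warning window as short as 30 seconds, the save triggered by the signal has to finish quickly, which is one reason periodic checkpoints remain useful as a fallback.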

According to GridStackHub.ai data, H100 spot on Vast.ai runs $1.49/hr versus $2.49/hr on-demand at RunPod — a 40% discount. A100 spot on Vast.ai reaches $0.89/hr versus $2.79/hr on-demand at Lambda — a 68% discount.

2. What is the difference between spot, on-demand, and reserved GPU pricing?

On-demand is pay-as-you-go with no commitment — you pay the listed rate and can terminate at any time. Best for unpredictable workloads and short jobs.

Spot (preemptible/community cloud) offers 40–70% discounts on unused capacity but the provider can reclaim instances with short notice. Best for fault-tolerant training, fine-tuning, and batch inference with checkpointing.

Reserved requires a 1-month to 3-year commitment in exchange for 20–55% off the on-demand rate. Best for predictable, long-running inference serving workloads.

Example: A CoreWeave H100 costs $3.92/hr on-demand, $1.79/hr on a 1-year reserved contract (54% savings), or $1.49–$1.89/hr on Vast.ai spot (no commitment, preemptible). Choose spot for training/batch with checkpointing, on-demand for short deadline-sensitive jobs, reserved for stable production inference.

3. How does B200 pricing compare with H100 per GPU-hour, and is the cost-per-token different?

B200 GPUs cost significantly more per hour than H100s on a raw hourly basis. According to GridStackHub.ai data, B200 on-demand pricing ranges from $5.98/hr (RunPod) to $8.60/hr (CoreWeave) per GPU — roughly 2–3× the H100 on-demand rate of $1.79–$3.92/hr.

However, B200 delivers approximately 2.5× the inference throughput of H100 for large language models due to its NVLink 5 interconnect, 192GB HBM3e memory, and next-generation compute cores. At scale, cost-per-token is near parity or slightly better on B200 for memory-bound workloads.

For training, B200's 192GB HBM3e memory (vs 80GB on H100 SXM5) enables larger batch sizes and reduces how often a model must be sharded across multiple GPUs, partially offsetting the higher hourly cost. B200 economics favor teams running memory-bound workloads at sustained high utilization. H100 remains the better value for light, intermittent, or compute-bound jobs.
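The cost-per-token argument comes down to dividing the hourly price ratio by the throughput ratio. The sketch below works through that arithmetic with the rates quoted in this answer and the approximate 2.5× throughput figure. The function name is hypothetical, and the pairings mix providers, so treat the output as a rough illustration, not a benchmark.

```python
def relative_cost_per_token(b200_rate, h100_rate, throughput_ratio=2.5):
    """B200 cost per token divided by H100 cost per token.

    A value below 1.0 means B200 is cheaper per token; above 1.0, H100 is.
    throughput_ratio is B200 tokens/sec divided by H100 tokens/sec.
    """
    price_ratio = b200_rate / h100_rate
    return price_ratio / throughput_ratio


if __name__ == "__main__":
    # Hourly rates (USD per GPU) quoted in the text above.
    h100_rates = {"H100 low": 1.79, "H100 RunPod": 2.49, "H100 CoreWeave": 3.92}
    b200_rates = {"B200 RunPod": 5.98, "B200 CoreWeave": 8.60}

    for b200_name, b200_rate in b200_rates.items():
        for h100_name, h100_rate in h100_rates.items():
            ratio = relative_cost_per_token(b200_rate, h100_rate)
            print(f"{b200_name} vs {h100_name}: {ratio:.2f}x cost per token")
```

The same-provider pairs land just under 1.0, which matches the "near parity or slightly better" claim. Pairing a B200 against the cheapest H100 rate pushes the ratio above 1.0, so the conclusion depends on which H100 price you would actually pay.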

4. How much can I save with reserved GPU pricing?

Reserved GPU pricing typically saves 20–55% versus on-demand rates, with savings scaling with commitment length.

From GridStackHub.ai live data:

CoreWeave H100: $1.79/hr reserved vs $3.92/hr on-demand — 54% saving
Corvex H100: $1.59/hr reserved-1yr vs $3.15/hr on-demand — 50% saving
AWS H100 (8-GPU p5.48xlarge): $19.22/hr reserved-1yr vs $32.77/hr on-demand — 41% saving per GPU
Google Cloud H100: $19.63/hr reserved-1yr vs $31.21/hr on-demand — 37% saving per GPU
Lightning AI H100: $2.19/hr reserved-1mo vs $2.89/hr on-demand — 24% saving

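Break-even utilization equals the reserved rate divided by the on-demand rate, so it falls as the discount grows: about 46% for the CoreWeave contract above and about 76% for the Lightning AI one-month rate. Below the break-even, on-demand costs less in aggregate because you only pay for hours used.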
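The savings percentages in the list above follow from one formula: saving = 1 − reserved rate ÷ on-demand rate. A short sketch that recomputes them from the quoted rates is shown here; the dictionary layout and function name are illustrative assumptions.

```python
# (reserved rate, on-demand rate) in USD per hour, as quoted in the list above.
RATES = {
    "CoreWeave H100": (1.79, 3.92),
    "Corvex H100": (1.59, 3.15),
    "AWS H100 (8-GPU p5.48xlarge)": (19.22, 32.77),
    "Google Cloud H100": (19.63, 31.21),
    "Lightning AI H100": (2.19, 2.89),
}


def saving_pct(reserved_rate, on_demand_rate):
    """Percentage saved by paying the reserved rate instead of on-demand."""
    return (1 - reserved_rate / on_demand_rate) * 100


if __name__ == "__main__":
    for name, (reserved, on_demand) in RATES.items():
        print(f"{name}: {saving_pct(reserved, on_demand):.0f}% saving")
```

Because the formula is a ratio, it gives the same answer whether the rates are per GPU or per 8-GPU instance, as long as both rates use the same unit.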

5. Which workloads are best suited for spot GPU instances?

Good candidates for spot:

LLM pre-training and fine-tuning with checkpoint-resume (every 15–30 minutes)
Batch inference over large datasets (not latency-sensitive)
Hyperparameter search and model evaluation runs
Embedding generation pipelines
Data preprocessing and tokenization

Poor candidates for spot:

Real-time inference APIs (which require guaranteed uptime and low latency)
Interactive Jupyter notebooks (work loss risk)
Jobs under 30 minutes (checkpoint overhead reduces benefit)
Any workload without checkpointing implemented

The rule: if losing 30 minutes of work is acceptable and the job can restart from a checkpoint, it qualifies for spot pricing. Training jobs over 4 hours with 30-minute checkpointing see the best economics on spot.

6. How do multi-region and egress costs affect total GPU pricing?

Egress fees can materially raise effective GPU costs when moving data in or out of the cloud:

AWS: $0.09/GB egress
Azure: $0.087/GB egress
GCP: $0.12/GB egress
Lambda Labs: 10TB/month free, then $0.10/GB
Cloudflare R2: $0 egress (zero-egress object storage)

EU-based providers (OVHcloud, Hetzner, DataCrunch, Nebius) typically price H100 10–25% higher than US providers, driven by electricity costs (EU average 15–22¢/kWh versus a US data center average of 6–8¢/kWh in Virginia and Texas). The gap varies by provider: OVHcloud EU H100 at $4.50/hr is about 5% above Lambda US at $4.29/hr.

For workloads with EU data-residency requirements (for example, GDPR-driven ones), EU providers may be required regardless of price premium; other compliance regimes such as HIPAA can also constrain provider choice. For purely cost-driven AI training with no residency constraints, US-based independent cloud providers consistently offer the lowest rates.
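To see how the egress rates above translate into dollars, the following sketch estimates the charge for moving a given volume of data out of each provider. It assumes flat per-GB pricing with an optional free allowance and treats 10TB as 10,000 GB; real bills may use tiered rates, and the function and table names are hypothetical.

```python
# (USD per GB, free allowance in GB) based on the rates listed above.
EGRESS_PRICING = {
    "AWS": (0.09, 0.0),
    "Azure": (0.087, 0.0),
    "GCP": (0.12, 0.0),
    "Lambda Labs": (0.10, 10_000.0),  # assumption: 10TB/month = 10,000 GB
    "Cloudflare R2": (0.0, 0.0),
}


def egress_cost(gb_transferred, rate_per_gb, free_gb=0.0):
    """Flat-rate egress charge after subtracting any free allowance."""
    billable_gb = max(gb_transferred - free_gb, 0.0)
    return billable_gb * rate_per_gb


if __name__ == "__main__":
    dataset_gb = 2_000  # hypothetical: a 2 TB dataset pulled out once
    for provider, (rate, free_gb) in EGRESS_PRICING.items():
        cost = egress_cost(dataset_gb, rate, free_gb)
        print(f"{provider}: ${cost:,.2f} to move {dataset_gb:,} GB out")
```

Comparing the result with the GPU bill for the same job shows whether egress is a rounding error or a line item worth designing around.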

7. Which GPU cloud provider has the cheapest H100 pricing?

According to GridStackHub.ai live data, the cheapest H100 pricing depends on the pricing type:

Spot/interruptible: Vast.ai at $1.49/hr (H100 SXM marketplace, preemptible)
On-demand, no commitment: Shadeform aggregator at $1.79/hr (routes to cheapest available); Together AI at $1.99/hr direct
On-demand, established providers: RunPod $2.49/hr, Hetzner $2.49/hr (EU), Crusoe Cloud $2.17/hr (US, climate-aligned)
Reserved 1-year: Corvex $1.59/hr, CoreWeave $1.79/hr

Hyperscalers (AWS $4.09/hr per GPU, GCP $3.90/hr, Azure $4.10/hr) charge significantly more on-demand but offer 60–70% spot discounts on that inflated baseline — bringing spot prices near parity with independent cloud. Use the GPU Cost Calculator to compare total cost for your specific workload and GPU count.

8. How do I calculate GPU cost for a training run?

The basic formula is: total cost = number of GPUs × hourly rate × hours

Example: fine-tuning a 7B model on 8× H100 for 72 hours

CoreWeave on-demand ($3.92/hr): 8 × $3.92 × 72 = $2,258
CoreWeave reserved-1yr ($1.79/hr): 8 × $1.79 × 72 = $1,031 (requires the 1-year commitment)
Vast.ai spot ($1.49/hr): 8 × $1.49 × 72 = $858 (with preemption risk)

Additional costs to include in your budget:

Object storage for checkpoints: ~$0.023/GB/month on S3
Egress when downloading the trained model or large datasets
CPU/RAM cost bundled into GPU instance (usually minor)

Use the GridStackHub GPU Cost Calculator to get provider-by-provider estimates for your specific GPU count, runtime, and workload type — including spot savings projections.
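The formula and the extra budget items can be combined into a small estimator. The sketch below is a rough illustration only: the function names are hypothetical, the checkpoint size is an assumed input, and storage is billed at the S3 rate quoted above for a single month.

```python
def gpu_cost(num_gpus, hourly_rate, hours):
    """total cost = number of GPUs x hourly rate x hours."""
    return num_gpus * hourly_rate * hours


def checkpoint_storage_cost(checkpoint_gb, months=1.0, rate_per_gb_month=0.023):
    """Object-storage cost for retained checkpoints at a flat per-GB monthly rate."""
    return checkpoint_gb * months * rate_per_gb_month


if __name__ == "__main__":
    # Per-GPU hourly rates quoted in the example above.
    options = {
        "CoreWeave on-demand": 3.92,
        "CoreWeave reserved-1yr": 1.79,
        "Vast.ai spot": 1.49,
    }
    storage = checkpoint_storage_cost(checkpoint_gb=500)  # hypothetical 500 GB retained
    for name, rate in options.items():
        compute = gpu_cost(num_gpus=8, hourly_rate=rate, hours=72)
        print(f"{name}: compute ${compute:,.2f} + storage ${storage:,.2f}")
```

Spot runs usually take longer than the nominal 72 hours because of restarts, so padding the `hours` argument for the spot case gives a more realistic comparison.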

9. At what utilization rate does reserved GPU pricing beat on-demand?

Reserved GPU pricing beats on-demand when your average utilization over the commitment period exceeds the break-even threshold.

CoreWeave H100 break-even example:

Reserved rate: $1.79/hr
On-demand rate: $3.92/hr
Discount: 54%
Break-even utilization: 46% (reserved rate ÷ on-demand rate = $1.79 ÷ $3.92)

If your GPUs run more than about 46% of the committed hours, reserved wins. At 100% utilization, reserved saves about $1,555/GPU/month versus on-demand (($3.92 − $1.79) × 730 hours). At 40% utilization, on-demand ($3.92 × 0.40 = $1.57 effective per committed hour) costs less than the $1.79 reserved rate.

Practical guideline: commit reserved for inference serving (typically 70%+ utilization) and use on-demand or spot for training jobs that run in bursts. The Reserved Instance Advisor computes this break-even for your specific usage patterns.
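The break-even logic can be checked with a few lines of code. This is a simplified sketch with hypothetical function names: it assumes reserved capacity is billed for every committed hour, on-demand is billed only for hours used, and there are no upfront fees or partial-term exits.

```python
def break_even_utilization(reserved_rate, on_demand_rate):
    """Fraction of committed hours above which reserved is cheaper."""
    return reserved_rate / on_demand_rate


def cheaper_option(reserved_rate, on_demand_rate, utilization):
    """Compare cost per committed hour at a given utilization (0.0 to 1.0)."""
    on_demand_effective = on_demand_rate * utilization  # pay only for used hours
    return "reserved" if reserved_rate < on_demand_effective else "on-demand"


if __name__ == "__main__":
    reserved, on_demand = 1.79, 3.92  # CoreWeave H100 rates from the example above
    threshold = break_even_utilization(reserved, on_demand)
    print(f"break-even utilization: {threshold:.1%}")
    for utilization in (0.40, 0.50, 0.70, 1.00):
        choice = cheaper_option(reserved, on_demand, utilization)
        print(f"at {utilization:.0%} utilization: {choice} is cheaper")
```

Swapping in the rates for another contract shows how quickly the threshold rises as the discount shrinks.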

10. How much does GPU cloud pricing vary by region?

GPU pricing varies 10–40% by region, driven by electricity costs, local competition, and supply availability.

US East (Virginia) data centers benefit from ~6.58¢/kWh electricity, enabling lower GPU rates. EU providers face 15–22¢/kWh, which can flow into 10–25% pricing premiums, though individual providers vary widely:

OVHcloud EU: H100 $4.50/hr (France)
DataCrunch EU: H100 $4.00/hr (Finland)
Hetzner EU: H100 $2.49/hr (Germany) — competitive due to scale
Lambda US: H100 $4.29/hr (1× SXM)
Genesis Cloud EU: H100 $3.59/hr (Iceland, 100% renewable)

Within the US, region choice matters: GPU spot prices differ 10–20% between us-east-1 and us-west-2 during peak demand. For workloads with EU data-residency requirements (for example, GDPR-driven ones), EU providers may be required regardless of price. See the full pricing comparison to filter by region and GPU model.

Compare live GPU pricing across 34+ providers

GridStackHub tracks 479+ GPU pricing records daily — on-demand, spot, and reserved — filtered by GPU model, provider, or region.
