Side-by-Side Comparison (H100, CoreWeave, 2026)
| Model | Hourly Rate | Monthly (720hr) | Annual | Savings vs OD |
|---|---|---|---|---|
| On-Demand | $2.23/hr | $1,606 | $19,535 | Baseline |
| Reserved 1-yr | $1.79/hr | $1,289 | $15,680 | −20% |
| Spot (Vast.ai) | ~$1.49/hr* | ~$1,073 | ~$13,052 | −33%* |
*Spot pricing fluctuates. Listed rate is typical, not guaranteed. Spot instances may be interrupted.
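The table's figures follow directly from the hourly rates. A quick sketch of the arithmetic (rates are the listed CoreWeave/Vast.ai examples; the spot rate is typical, not guaranteed):

```python
# Reproduce the comparison table from hourly rates. The spot rate is a
# typical Vast.ai figure and fluctuates; all rates are illustrative.
HOURS_PER_MONTH = 720    # 30-day month
HOURS_PER_YEAR = 8_760   # 365 days

rates = {
    "on_demand": 2.23,
    "reserved_1yr": 1.79,
    "spot": 1.49,
}

def costs(hourly_rate: float) -> dict:
    """Monthly and annual cost at full utilization, plus savings vs on-demand."""
    return {
        "monthly": hourly_rate * HOURS_PER_MONTH,
        "annual": hourly_rate * HOURS_PER_YEAR,
        "savings_vs_od": 1 - hourly_rate / rates["on_demand"],
    }

for name, rate in rates.items():
    c = costs(rate)
    print(f"{name:12s} ${c['monthly']:,.0f}/mo  ${c['annual']:,.0f}/yr  "
          f"-{c['savings_vs_od']:.0%} vs on-demand")
```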
On-Demand: Maximum Flexibility, Highest Cost
On-demand is pay-as-you-go with no commitments. You can start, stop, and resize instances at any time. Providers guarantee availability (within reason) — you won't lose the instance unless you stop it.
Best for: Development, experimentation, short training runs (< 1 week), and variable-load inference. Any workload where flexibility justifies the premium.
Worst for: Long-running training jobs, production inference with known traffic patterns. You're paying the maximum rate continuously.
Reserved: Best Economics for Steady Workloads
Reserved pricing commits you to paying for capacity for 1 or 3 years. In exchange, you get a 20–50% discount. The GPU is reserved for you — you won't be preempted, and you're guaranteed availability.
Break-even math: 1-year reserved at CoreWeave saves $0.44/hr vs on-demand, or $3,854/year per GPU at full utilization. The catch: reserved bills every hour of the term whether you use it or not, while on-demand bills only hours used. Reserved wins once annual usage exceeds (reserved rate ÷ on-demand rate) × 8,760 ≈ 7,032 hours, roughly 80% utilization. Any always-on workload clears that bar easily.
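That break-even threshold can be computed directly from the two rates in the table:

```python
# Break-even utilization for reserved vs on-demand (rates from the table above).
OD_RATE = 2.23        # $/hr on-demand, billed only for hours used
RESERVED_RATE = 1.79  # $/hr with 1-yr commitment, billed for all hours
HOURS_PER_YEAR = 8_760

# Reserved cost is fixed for the year regardless of usage.
reserved_annual = RESERVED_RATE * HOURS_PER_YEAR

# On-demand cost equals reserved when hours_used * OD_RATE == reserved_annual.
break_even_hours = reserved_annual / OD_RATE
utilization = break_even_hours / HOURS_PER_YEAR

print(f"Break-even: {break_even_hours:,.0f} hr/yr ({utilization:.0%} utilization)")
```

Above roughly 80% utilization, reserved is cheaper; below it, on-demand wins despite the higher hourly rate.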
Best for: Production inference endpoints, ongoing training pipelines, teams with predictable GPU needs exceeding 6 months.
Spot: Maximum Savings, Requires Engineering
Spot instances use excess cloud capacity sold at steep discounts. The key constraint: they can be reclaimed on short notice (typically 30 seconds to 2 minutes). That's the price of the 33–80% discount.
Making spot work:
- Checkpoint training jobs every 10–30 minutes
- Use distributed training across multiple spot instances (interrupting one doesn't kill the job)
- Implement automatic job re-submission on interruption
- Store training state on durable object storage (S3/R2), not local disk
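The checklist above boils down to a checkpoint-and-resume loop. A minimal sketch, with local files standing in for durable object storage; `save_to_durable`, `load_from_durable`, and the checkpoint path are illustrative names, not a real provider API:

```python
# Spot-tolerant training sketch: checkpoint periodically to durable storage,
# and resume from the last checkpoint when a preempted job is re-submitted.
import json
import os
import time

CHECKPOINT_PATH = "checkpoint.json"   # would be an S3/R2 URI in practice
CHECKPOINT_EVERY_SECS = 600           # every 10 minutes, per the guidance above

def save_to_durable(state: dict) -> None:
    # Stand-in for an upload to S3/R2; local disk on a spot node is lost.
    with open(CHECKPOINT_PATH, "w") as f:
        json.dump(state, f)

def load_from_durable() -> dict:
    if os.path.exists(CHECKPOINT_PATH):
        with open(CHECKPOINT_PATH) as f:
            return json.load(f)
    return {"step": 0}                # no checkpoint yet: fresh start

def train(total_steps: int) -> int:
    state = load_from_durable()       # resume where the last instance stopped
    last_ckpt = time.monotonic()
    while state["step"] < total_steps:
        state["step"] += 1            # one training step (placeholder)
        if time.monotonic() - last_ckpt >= CHECKPOINT_EVERY_SECS:
            save_to_durable(state)
            last_ckpt = time.monotonic()
    save_to_durable(state)            # final checkpoint
    return state["step"]
```

On interruption, the job scheduler re-submits `train`, which picks up from the last durable checkpoint instead of step 0; at most one checkpoint interval of work is lost.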
Best for: Large training jobs that can be paused and resumed, embedding generation, batch offline inference.
Read: Full Spot GPU Pricing Guide →
Decision Framework
| Scenario | Recommended Model | Reason |
|---|---|---|
| Development / experimentation | On-demand | Flexibility beats cost at low hours |
| Training run < 1 week | On-demand or Spot | Spot if fault-tolerant, OD if not |
| Training run > 1 month | Spot or Reserved | Spot for max savings; reserved if reliability critical |
| Production inference (< 10k req/day) | On-demand | Variable load, flexibility needed |
| Production inference (> 50k req/day) | Reserved | Predictable load, need guaranteed capacity |
| Batch embedding generation | Spot | Interruptible, maximize cost savings |
| Compliance / SLA required | Reserved | Guaranteed availability, no preemptions |
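The decision table can be read as a small rule function. The workload keys and the 50k requests/day threshold below are illustrative simplifications of the table, not a standard API:

```python
# The decision framework above as rules. All names/thresholds are illustrative.
def recommend(workload: str, *, fault_tolerant: bool = False,
              req_per_day: int = 0, sla_required: bool = False) -> str:
    if sla_required:
        return "reserved"             # guaranteed availability, no preemption
    if workload == "development":
        return "on-demand"            # flexibility beats cost at low hours
    if workload == "training_short":  # run shorter than ~1 week
        return "spot" if fault_tolerant else "on-demand"
    if workload == "training_long":   # run longer than ~1 month
        return "spot" if fault_tolerant else "reserved"
    if workload == "inference":
        return "reserved" if req_per_day > 50_000 else "on-demand"
    if workload == "batch":
        return "spot"                 # interruptible, maximize savings
    raise ValueError(f"unknown workload: {workload}")
```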
Calculate Your Savings
Use our GPU Cost Calculator to model on-demand vs reserved vs spot costs for your specific workload — GPU type, hours per month, and workload duration.
See live spot and reserved prices
Real-time pricing for all three models across 32+ providers.
GPU Spot Pricing Guide →