H100 Price History: A 45% Decline
The H100 pricing story from 2024 to 2026 is one of scarcity resolving. In mid-2024, demand for the H100 SXM5 far outstripped supply: NVIDIA's production ramp lagged the market, and specialized cloud providers couldn't source enough GPUs to serve their customers.
| Period | H100 On-Demand (CoreWeave) | H100 Spot (Best Available) | Supply Condition |
|---|---|---|---|
| Q1 2024 | ~$4.50/hr | ~$3.20/hr | Severe scarcity |
| Q3 2024 | ~$3.50/hr | ~$2.50/hr | Easing slightly |
| Q1 2025 | ~$2.75/hr | ~$1.90/hr | Balanced |
| Q3 2025 | ~$2.39/hr | ~$1.65/hr | Slight oversupply |
| Q1 2026 | ~$2.23/hr | ~$1.49/hr | Competitive market |
The 45% decline from peak represents normalization, not collapse. H100 GPUs remain in high demand for inference at scale; the decline reflects supply catching up, not demand disappearing.
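The size of the decline depends on which quarter and which price tier are used as the baseline. The sketch below recomputes it from the approximate table values; the dictionary layout and function names are illustrative assumptions, not part of any tracking tool. With these rounded inputs, the Q1 2024 to Q1 2026 drop comes out near 50% on-demand and slightly higher for spot, so the headline figure may be measured from a different reference point than the table's first row.

```python
# Quarterly H100 prices in USD per GPU-hour, copied from the table above.
# The values are approximate (the table marks them with "~").
PRICES = {
    "Q1 2024": {"on_demand": 4.50, "spot": 3.20},
    "Q3 2024": {"on_demand": 3.50, "spot": 2.50},
    "Q1 2025": {"on_demand": 2.75, "spot": 1.90},
    "Q3 2025": {"on_demand": 2.39, "spot": 1.65},
    "Q1 2026": {"on_demand": 2.23, "spot": 1.49},
}


def pct_decline(start: float, end: float) -> float:
    """Percentage drop from start to end (positive means the price fell)."""
    if start <= 0:
        raise ValueError("start price must be positive")
    return (start - end) / start * 100.0


def decline_between(first: str, last: str, tier: str) -> float:
    """Percentage decline for one price tier between two table periods."""
    return pct_decline(PRICES[first][tier], PRICES[last][tier])


if __name__ == "__main__":
    for tier in ("on_demand", "spot"):
        drop = decline_between("Q1 2024", "Q1 2026", tier)
        print(f"{tier}: {drop:.1f}% decline from Q1 2024 to Q1 2026")
```

Swapping in a different starting period (for example `"Q3 2024"`) shows how sensitive the percentage is to the chosen peak.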
B200 Market Entry: Impact on H100 Economics
NVIDIA's Blackwell architecture (B200, GB200 NVL72) began reaching cloud providers at meaningful scale in late 2025. The B200 delivers approximately 2.5× the inference throughput of H100 per GPU, at roughly 2.2–2.4× the price per GPU-hour.
The result: B200 offers roughly equivalent cost per token to H100 for inference workloads (about 0.88–0.96× at these ratios), with dramatically better latency. For training, B200 is clearly superior in cost efficiency at scale. This has shifted high-end training workloads to B200, freeing H100 supply for inference and applying further downward pressure on H100 prices.
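The cost-per-token comparison follows directly from the two ratios quoted above: cost per token scales with price per GPU-hour divided by tokens per GPU-hour. A minimal sketch of that arithmetic, assuming both GPUs are fully utilized and that the 2.5× throughput figure holds for the workload in question (the function and constant names are illustrative):

```python
def relative_cost_per_token(price_ratio: float, throughput_ratio: float) -> float:
    """B200 cost per token divided by H100 cost per token.

    Cost per token is proportional to (price per GPU-hour) divided by
    (tokens per GPU-hour), so the ratio of costs is price_ratio / throughput_ratio.
    A result below 1.0 means B200 is cheaper per token.
    """
    if throughput_ratio <= 0:
        raise ValueError("throughput_ratio must be positive")
    return price_ratio / throughput_ratio


THROUGHPUT_RATIO = 2.5          # B200 vs H100 inference throughput (from the text)
PRICE_RATIO_RANGE = (2.2, 2.4)  # B200 vs H100 price per GPU-hour (from the text)

if __name__ == "__main__":
    for price_ratio in PRICE_RATIO_RANGE:
        rel = relative_cost_per_token(price_ratio, THROUGHPUT_RATIO)
        print(f"price {price_ratio:.1f}x -> B200 cost per token = {rel:.2f}x H100")
```

If a workload sees less than the quoted throughput gain (for instance, a small model that cannot saturate the larger GPU), the ratio rises above 1.0 and H100 becomes the cheaper option per token.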
Key Market Dynamics
- Inference demand surge: The rise of AI-native products (GPT wrappers, coding assistants, agent frameworks) has created massive inference demand, which partially offsets the H100 demand lost as training workloads migrate to B200.
- Spot market maturation: Vast.ai, RunPod, and similar GPU marketplaces have matured significantly. Spot prices have become a reliable floor for H100 pricing, pulling on-demand prices downward.
- Hyperscaler stickiness: AWS, Azure, and GCP have maintained higher prices (2.5–3× those of specialized providers) due to compliance, integration, and enterprise relationship lock-in. This gap is widening, not narrowing.
- Regional supply variation: European GPU availability has tightened while US supply has normalized. EU providers (OVH, Hetzner, Scaleway) continue to command 10–20% premiums over US equivalents.
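To put the hyperscaler and regional premiums in dollar terms, the multipliers above can be applied to a reference price. The sketch below uses the Q1 2026 CoreWeave on-demand figure as that reference, which is an assumption made for illustration: the article gives the premiums only relative to "specialized providers" and "US equivalents" in general, so the resulting ranges are rough implied values, not quoted prices.

```python
from typing import Tuple

# Assumed reference: Q1 2026 H100 on-demand price at a specialized US provider.
H100_REFERENCE_USD_HR = 2.23


def apply_multiplier_range(base: float, low: float, high: float) -> Tuple[float, float]:
    """Return the (low, high) price range implied by a multiplier range."""
    if low > high:
        raise ValueError("low multiplier must not exceed high multiplier")
    return (base * low, base * high)


if __name__ == "__main__":
    # Hyperscalers: 2.5-3x specialized-provider pricing (from the text).
    hyper = apply_multiplier_range(H100_REFERENCE_USD_HR, 2.5, 3.0)
    # EU providers: 10-20% premium over US equivalents (from the text).
    eu = apply_multiplier_range(H100_REFERENCE_USD_HR, 1.10, 1.20)
    print(f"Implied hyperscaler H100 range: ${hyper[0]:.2f} to ${hyper[1]:.2f} per hour")
    print(f"Implied EU H100 range: ${eu[0]:.2f} to ${eu[1]:.2f} per hour")
```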
2026 Pricing Outlook
Based on supply pipeline and demand signals tracked through GridStackHub:
- H100: Prices will stabilize in the $1.75–2.25/hr range. Further major declines are unlikely as deployment for inference absorbs the supply glut. Spot prices could dip to $1.20–1.40/hr in periods of low demand.
- H200: Modest price declines as supply increases. Expect $3.50–4.50/hr on-demand through end of 2026.
- B200: Prices will fall as supply ramps. Expect $4.00–5.00/hr by end of 2026, down from the current $5.29/hr floor.
- A100: Continued decline toward commodity pricing. $0.80–1.20/hr on-demand by end of 2026 for most providers.
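The forecast ranges can be read as implied percentage changes from today's prices. A rough sketch for the two GPUs whose current prices appear in this article, assuming the $2.23/hr H100 on-demand figure and the $5.29/hr B200 floor are comparable baselines for the forecast ranges (the article does not state this explicitly, and the names below are illustrative):

```python
# Current prices in USD per GPU-hour, taken from figures quoted in this article.
CURRENT_USD_HR = {"H100": 2.23, "B200": 5.29}

# End-of-2026 forecast ranges (low, high) in USD per GPU-hour, from the list above.
FORECAST_USD_HR = {"H100": (1.75, 2.25), "B200": (4.00, 5.00)}


def implied_change_pct(current: float, target: float) -> float:
    """Percentage change from current to target (negative means a price drop)."""
    if current <= 0:
        raise ValueError("current price must be positive")
    return (target - current) / current * 100.0


if __name__ == "__main__":
    for gpu, (low, high) in FORECAST_USD_HR.items():
        now = CURRENT_USD_HR[gpu]
        change_low = implied_change_pct(now, low)
        change_high = implied_change_pct(now, high)
        print(f"{gpu}: {change_low:+.1f}% to {change_high:+.1f}% vs ${now:.2f}/hr")
```

The H200 and A100 forecasts are left out because the article does not quote a current price for either.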