The price spread on identical H100 hardware across cloud providers in April 2026 is 4.7x. The cheapest H100 hour costs $1.49. The most expensive costs $6.98. Same GPU. Same specs. Different provider.
GPU cloud pricing in 2026 is a buyer's market — if you know where to look. The big hyperscalers (AWS, GCP, Azure) command a significant premium, while specialized GPU cloud providers offer the same hardware at 40–70% lower prices. The gap has narrowed slightly from 2025 as hyperscalers competed for AI workloads, but the spread remains enormous.
This article provides a complete per-GPU, per-provider cost-per-hour comparison based on GridStackHub's daily-updated pricing database of 266+ records across 23+ active providers.
H100 GPU Cost Per Hour by Provider (April 2026)
The NVIDIA H100 80GB SXM is the dominant GPU for LLM training and large-scale inference. Here is on-demand pricing ranked cheapest to most expensive:
| Provider | H100 SXM 80GB | Pricing Type | Tier | Notes |
|---|---|---|---|---|
| Vast.ai | $1.49/hr | Spot market | Budget | Interruptible; varies by demand |
| RunPod | $1.74/hr | On-demand | Budget | Flexible billing, no commitment |
| Lambda Labs | $1.99/hr | On-demand | Budget | Strong uptime SLAs for GPU cloud |
| CoreWeave | $2.29/hr | On-demand | Mid | High-performance networking, HPC focus |
| Vultr | $2.49/hr | On-demand | Mid | Global regions, good latency |
| Paperspace | $2.89/hr | On-demand | Mid | ML-optimized tools included |
| AWS (p5.48xlarge) | $4.10/hr* | On-demand | Premium | *per H100 equiv.; enterprise SLA, compliance |
| Google Cloud | $4.65/hr* | On-demand | Premium | *A3 instances; TPU alt. available |
| Azure (ND H100 v5) | $6.98/hr* | On-demand | Premium | *per-GPU equiv.; includes Azure SLA |
*Hyperscaler per-GPU-equivalent pricing estimated from multi-GPU instance rates. Actual billing is per instance. Source: GridStackHub database, April 2026. Lowest-cost region shown.
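The arithmetic behind the asterisked per-GPU figures is simple division. A minimal sketch, with instance rates chosen to be consistent with the table above (treat them as illustrative, not quotes; actual hyperscaler billing is per instance):

```python
# Per-GPU-equivalent rate = instance hourly rate / GPUs per instance.
# Instance rates are illustrative assumptions consistent with the table above.
instances = {
    "aws_p5.48xlarge":  {"hourly_rate": 32.80, "gpus": 8},  # 8x H100 SXM
    "azure_nd_h100_v5": {"hourly_rate": 55.84, "gpus": 8},  # 8x H100
}

for name, spec in instances.items():
    per_gpu = spec["hourly_rate"] / spec["gpus"]
    print(f"{name}: ${per_gpu:.2f} per GPU-hour "
          f"(billed per instance at ${spec['hourly_rate']:.2f}/hr)")
```

The catch: you rent the whole instance. If your job only saturates four of the eight GPUs, your effective per-GPU cost doubles.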
A100 GPU Cost Per Hour (2026)
The NVIDIA A100 remains the workhorse for teams that do not need the H100's newest capabilities. The H100 is substantially faster for FP8 transformer training, but for many other workloads the A100's much lower hourly rate makes it the stronger price-performance buy:
| Provider | A100 80GB SXM4 | A100 40GB | Pricing Type |
|---|---|---|---|
| Lambda Labs | $1.29/hr | $0.89/hr | On-demand |
| RunPod | $1.35/hr | $0.95/hr | On-demand |
| CoreWeave | $1.59/hr | $1.12/hr | On-demand |
| Paperspace | $1.89/hr | $1.29/hr | On-demand |
| AWS (p4de/p4d) | $3.06/hr* | $2.40/hr* | On-demand |
| Google Cloud (A2) | $3.18/hr* | $2.55/hr* | On-demand |
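Whether the A100's discount survives the throughput gap depends on your workload. A minimal sketch, with the relative-throughput ratios as loud assumptions (benchmark your own jobs before deciding):

```python
# Effective cost per H100-equivalent hour = hourly price / relative throughput.
h100_price, a100_price = 1.74, 1.29  # cheapest on-demand rates from the tables above

# ASSUMPTION: the A100:H100 throughput ratio varies widely by workload,
# low for FP8 transformer training, higher for memory-bound or FP32 jobs.
for ratio in (0.45, 0.80):
    a100_effective = a100_price / ratio
    winner = "A100" if a100_effective < h100_price else "H100"
    print(f"A100 at {ratio:.0%} of H100 speed: "
          f"${a100_effective:.2f}/H100-equiv-hr, {winner} wins on price-performance")
```

At the low ratio the H100 is cheaper per unit of work despite the higher sticker price; at the high ratio the A100 wins.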
L40S, A10G, and T4: Cost Per Hour for Inference Workloads
Training gets the attention, but inference is where most GPU spend actually goes at scale. These GPUs are purpose-built for cost-efficient serving:
| GPU | VRAM | Price Range (On-Demand) | Best Providers | Ideal Workload |
|---|---|---|---|---|
| L40S | 48 GB | $0.89–$2.80/hr | CoreWeave, Lambda, RunPod | Inference, image gen, fine-tuning small models |
| A10G | 24 GB | $0.52–$1.28/hr | RunPod, Vast.ai, Lambda | 7B model serving, batch inference, NLP |
| RTX 4090 | 24 GB | $0.39–$0.79/hr | Vast.ai, RunPod | Dev/test, light inference, small models |
| T4 | 16 GB | $0.35–$0.76/hr | GCP, AWS, Lambda | Production inference at scale, cost-sensitive |
Underrated pick: The L40S at $0.89/hr from CoreWeave delivers strong inference performance for models up to 34B parameters. For teams that do not need H100/A100 training throughput, it cuts GPU costs by 40–55% vs. comparable A100 deployments.
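For serving workloads, the number that matters is cost per token, not cost per hour. A minimal sketch; the throughput figures are assumptions for illustration, so measure your own stack:

```python
# $ per 1M output tokens = hourly price / (tokens per second * 3600) * 1e6
# Prices are the low end of the ranges above; throughputs are ASSUMED.
configs = {
    "A10G @ $0.52/hr": (0.52, 1500),  # ASSUMED ~1,500 tok/s serving a 7B model
    "L40S @ $0.89/hr": (0.89, 2800),  # ASSUMED ~2,800 tok/s on the same model
}

for name, (price, tokens_per_sec) in configs.items():
    per_million = price / (tokens_per_sec * 3600) * 1e6
    print(f"{name}: ${per_million:.3f} per 1M tokens")
```

The cheaper GPU per hour is not automatically the cheaper GPU per token; batch size and whether the model fits in VRAM dominate.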
H200 and B200: Next-Gen GPU Pricing
The H200 (141 GB HBM3e) and B200 (192 GB HBM3e) represent the current frontier of GPU hardware. Availability is limited and pricing reflects it:
| GPU | VRAM | Current Range | Availability | Notes |
|---|---|---|---|---|
| H200 SXM 141GB | 141 GB | $2.89–$8.50/hr | Limited (CoreWeave, Lambda, AWS) | Frontier model training; ~1.8× H100 memory |
| B200 192GB | 192 GB | $5.50–$15.00/hr | Very limited (CoreWeave only in Q1 2026) | Blackwell architecture; newest NVLink fabric |
For most teams, H200 and B200 are not cost-effective in 2026. The use case is narrow: training models that genuinely require >80 GB of GPU memory per chip and where the 40–50% throughput improvement over H100 justifies the 2–3x price premium.
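You can sanity-check that claim with simple cost-per-work arithmetic. A minimal sketch using the low end of each price range; the throughput multipliers are assumptions:

```python
# Cost per H100-hour-equivalent of work = hourly price / relative throughput.
gpus = {
    "H100": (1.74, 1.00),  # cheapest on-demand rate from the H100 table
    "H200": (2.89, 1.45),  # ASSUMED ~45% faster than H100
    "B200": (5.50, 2.20),  # ASSUMED multiplier; real-world numbers vary
}

for name, (price, rel_throughput) in gpus.items():
    print(f"{name}: ${price / rel_throughput:.2f} per H100-hour of work")
```

Even at the cheapest H200/B200 rates and generous throughput assumptions, the H100 wins on cost per unit of work; the newer chips earn their premium only when memory capacity is the binding constraint.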
Reserved vs. On-Demand: How Much Do You Save?
Committing to reserved capacity for 1+ years typically cuts your per-hour cost by 25–50%, depending on the provider:
| GPU | On-Demand (Best) | 1-Year Reserved (Best) | Savings | Annual Delta |
|---|---|---|---|---|
| H100 SXM 80GB | $1.74/hr | $0.91/hr | 48% | $7,277/yr |
| A100 80GB | $1.29/hr | $0.78/hr | 40% | $4,468/yr |
| L40S 48GB | $0.89/hr | $0.56/hr | 37% | $2,891/yr |
Annual delta = cost savings per GPU for 24/7 usage. An 8×H100 cluster on 1-year reserved vs. on-demand saves $58,216/year at the cheapest provider — before accounting for the hyperscaler vs. specialist gap.
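The annual-delta column is easy to reproduce (small differences come from rounding the hourly rates shown). A minimal sketch:

```python
HOURS_PER_YEAR = 8760  # 24/7 utilization

def annual_delta(on_demand: float, reserved: float, gpu_count: int = 1) -> float:
    """Annual savings from 1-year reserved vs. on-demand, running around the clock."""
    return (on_demand - reserved) * HOURS_PER_YEAR * gpu_count

# H100 SXM 80GB, rates from the table above
print(f"Per GPU:     ${annual_delta(1.74, 0.91):,.0f}/yr")
print(f"8xH100 node: ${annual_delta(1.74, 0.91, gpu_count=8):,.0f}/yr")
```

The flip side: reserved capacity is a commitment. If your utilization drops below roughly 52% (the reserved-to-on-demand price ratio), on-demand would have been cheaper.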
What Drives GPU Price Differences in 2026?
The same GPU hardware can cost 4.7x as much at one provider as at another. Here is why:
- Data center electricity costs: Specialized GPU clouds often colocate in low-cost power regions (Virginia, Texas, Oregon). Hyperscalers spread costs across global infrastructure with higher overhead.
- Capital amortization: CoreWeave, Lambda, and RunPod have purpose-built GPU infrastructure with different capex models than hyperscalers who also pay for general-purpose compute.
- Support and SLA overhead: AWS, GCP, and Azure include enterprise support, 99.9%+ SLAs, compliance certifications (SOC 2, HIPAA, FedRAMP), and 24/7 account management in their pricing. You pay for these whether you use them or not.
- Network egress pricing: Hyperscaler egress fees can add $0.05–$0.09/GB. For large model training with frequent checkpointing, this adds up fast (see the sketch after this list).
- Spot/preemption availability: Providers with significant GPU overcapacity (Vast.ai, RunPod) can offer spot pricing at 50–70% below on-demand.
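To make the checkpointing point concrete, here is a rough sketch; model size, cadence, and rate are all assumptions, so plug in your own numbers:

```python
# Rough monthly egress cost for shipping checkpoints out of a hyperscaler.
checkpoint_gb = 140        # ASSUMED: ~70B params in bf16 (~2 bytes per parameter)
checkpoints_per_day = 6    # ASSUMED: one checkpoint every 4 hours
egress_per_gb = 0.09       # upper end of the hyperscaler range cited above

monthly_cost = checkpoint_gb * checkpoints_per_day * 30 * egress_per_gb
print(f"Checkpoint egress alone: ${monthly_cost:,.0f}/month")  # ~$2,268/month
```

That is before any dataset movement or artifact downloads, and it never shows up in a per-GPU-hour comparison.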
How to Use This Data
Per-hour pricing is a starting point. Your actual decision should factor in:
- Compliance requirements: HIPAA, SOC 2, FedRAMP may force you toward hyperscalers regardless of cost.
- Egress volume: If you move >10 TB/month out of the cloud, calculate egress fees before choosing a provider.
- GPU availability: Spot instances at the cheapest providers are often sold out. Verify availability before building a workload dependency on a specific provider.
- Support needs: Production workloads with SLA requirements need providers with formal uptime guarantees and support contracts.
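Putting those factors together, here is a minimal sketch of the kind of estimate a cost calculator produces; every input is an assumption to replace with your own:

```python
def monthly_cost(hourly_rate: float, gpu_count: int, utilization: float = 1.0,
                 egress_tb: float = 0.0, egress_per_gb: float = 0.0) -> float:
    """Estimated monthly spend: GPU hours plus network egress."""
    gpu_hours = 730 * utilization * gpu_count  # ~730 hours per month
    return gpu_hours * hourly_rate + egress_tb * 1024 * egress_per_gb

# Illustrative: 8x H100, 10 TB/month egress, full utilization
specialist  = monthly_cost(1.74, 8, egress_tb=10, egress_per_gb=0.00)  # ASSUMED free egress
hyperscaler = monthly_cost(4.10, 8, egress_tb=10, egress_per_gb=0.09)
print(f"Specialist:  ${specialist:,.0f}/mo")
print(f"Hyperscaler: ${hyperscaler:,.0f}/mo")
```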
Use the GridStackHub Cost Calculator to see exact monthly cost estimates for your specific GPU type, count, region, and pricing model — with all 23+ providers compared side by side.
Bottom Line: Who Has the Cheapest GPUs in 2026?
For most AI/ML workloads without enterprise compliance requirements: Lambda Labs, RunPod, and CoreWeave consistently offer the best combination of price, reliability, and GPU availability. Vast.ai wins on raw price for workloads that can tolerate spot interruptions.
Hyperscalers (AWS, GCP, Azure) remain the right choice for regulated industries, teams already deeply integrated into their ecosystems, or workloads requiring global edge deployment. The 3–4x price premium buys enterprise infrastructure — but if you do not need it, you are leaving real money on the table.