Live data — pricing records updated daily from 23+ providers
4.7x

The price spread on identical H100 hardware across cloud providers in April 2026. The cheapest H100 hour costs $1.49. The most expensive costs $6.98. Same GPU. Same specs. Different provider.

GPU cloud pricing in 2026 is a buyer's market — if you know where to look. The big hyperscalers (AWS, GCP, Azure) command a significant premium, while specialized GPU cloud providers offer the same hardware at 40–70% lower prices. The gap has narrowed slightly from 2025 as hyperscalers competed for AI workloads, but the spread remains enormous.

This article provides a complete per-GPU, per-provider cost-per-hour comparison based on GridStackHub's daily-updated pricing database of 266+ records across 23+ active providers.

H100 GPU Cost Per Hour by Provider (April 2026)

The NVIDIA H100 80GB SXM is the dominant GPU for LLM training and large-scale inference. Here is on-demand pricing ranked cheapest to most expensive:

| Provider | H100 SXM 80GB | Pricing Type | Tier | Notes |
|---|---|---|---|---|
| Vast.ai | $1.49/hr | Spot market | Budget | Interruptible; varies by demand |
| RunPod | $1.74/hr | On-demand | Budget | Flexible billing, no commitment |
| Lambda Labs | $1.99/hr | On-demand | Budget | Strong uptime SLAs for GPU cloud |
| CoreWeave | $2.29/hr | On-demand | Mid | High-performance networking, HPC focus |
| Vultr | $2.49/hr | On-demand | Mid | Global regions, good latency |
| Paperspace | $2.89/hr | On-demand | Mid | ML-optimized tools included |
| AWS (p5.48xlarge) | $4.10/hr* | On-demand | Premium | *Per H100 equivalent; enterprise SLA, compliance |
| Google Cloud (A3) | $4.65/hr* | On-demand | Premium | *Per H100 equivalent; TPU alternative available |
| Azure (ND H100 v5) | $6.98/hr* | On-demand | Premium | *Per-GPU equivalent; includes Azure SLA |

*Hyperscaler per-GPU-equivalent pricing estimated from multi-GPU instance rates. Actual billing is per instance. Source: GridStackHub database, April 2026. Lowest-cost region shown.
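The per-GPU-equivalent figures in the hyperscaler rows come from dividing an instance's hourly rate by its GPU count. A minimal sketch of that conversion (the $32.80/hr instance price is an illustrative placeholder chosen to match the $4.10 row, not a published quote):

```python
# Sketch: derive per-GPU-equivalent rates from multi-GPU instance pricing,
# as done for the hyperscaler rows above. Prices are illustrative.

def per_gpu_rate(instance_hourly: float, gpu_count: int) -> float:
    """Per-GPU-equivalent hourly price for a multi-GPU instance."""
    return instance_hourly / gpu_count

# An 8-GPU H100 instance billed at $32.80/hr works out to $4.10 per GPU-hour.
aws_equiv = per_gpu_rate(32.80, 8)

# The headline spread: cheapest vs. most expensive H100 hour in the table.
spread = 6.98 / 1.49  # ≈ 4.7x

print(f"${aws_equiv:.2f}/GPU-hr, spread {spread:.1f}x")
```

Remember that actual hyperscaler billing is per instance, so you pay for all eight GPUs whether your job saturates them or not.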

A100 GPU Cost Per Hour (2026)

The NVIDIA A100 remains the workhorse for teams that do not need the H100's newest capabilities. On many training workloads it delivers a substantial fraction of H100 throughput at significantly lower cost:

| Provider | A100 80GB SXM4 | A100 40GB | Pricing Type |
|---|---|---|---|
| Lambda Labs | $1.29/hr | $0.89/hr | On-demand |
| RunPod | $1.35/hr | $0.95/hr | On-demand |
| CoreWeave | $1.59/hr | $1.12/hr | On-demand |
| Paperspace | $1.89/hr | $1.29/hr | On-demand |
| AWS (p4de/p4d) | $3.06/hr* | $2.40/hr* | On-demand |
| Google Cloud (A2) | $3.18/hr* | $2.55/hr* | On-demand |

*Per-GPU equivalent, estimated from multi-GPU instance rates.

L40S, A10G, and T4: Cost Per Hour for Inference Workloads

Training gets the attention, but inference is where most GPU spend actually goes at scale. These GPUs are purpose-built for cost-efficient serving:

| GPU | VRAM | Price Range (On-Demand) | Best Providers | Ideal Workload |
|---|---|---|---|---|
| L40S | 48 GB | $0.89–$2.80/hr | CoreWeave, Lambda, RunPod | Inference, image gen, fine-tuning small models |
| A10G | 24 GB | $0.52–$1.28/hr | RunPod, Vast.ai, Lambda | 7B model serving, batch inference, NLP |
| RTX 4090 | 24 GB | $0.39–$0.79/hr | Vast.ai, RunPod | Dev/test, light inference, small models |
| T4 | 16 GB | $0.35–$0.76/hr | GCP, AWS, Lambda | Production inference at scale, cost-sensitive |

Underrated pick: The L40S at $0.89/hr from CoreWeave delivers strong inference performance for models up to 34B parameters. For teams not needing H100/A100 training throughput, it cuts GPU costs by 40–55% vs. comparable A100 deployments.
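The 40–55% figure above can be sanity-checked against the A100 80GB on-demand rates listed earlier. A quick sketch, using the CoreWeave and Paperspace A100 prices as the comparison range:

```python
# Sketch: check the 40-55% savings claim for L40S vs. A100 80GB inference,
# using on-demand rates from the tables above.

def savings_pct(cheaper: float, baseline: float) -> float:
    """Percentage saved by choosing the cheaper hourly rate."""
    return (1 - cheaper / baseline) * 100

l40s = 0.89                  # CoreWeave L40S, $/hr
a100_range = [1.59, 1.89]    # CoreWeave and Paperspace A100 80GB, $/hr

for a100 in a100_range:
    print(f"vs ${a100}/hr A100: {savings_pct(l40s, a100):.0f}% cheaper")
```

Against this range the L40S comes out roughly 44–53% cheaper per hour, consistent with the claim, before accounting for any per-token throughput difference between the two chips.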

H200 and B200: Next-Gen GPU Pricing

The H200 (141 GB HBM3e) and B200 (192 GB HBM3e) represent the current frontier of GPU hardware. Availability is limited and pricing reflects it:

| GPU | VRAM | Current Range | Availability | Notes |
|---|---|---|---|---|
| H200 SXM | 141 GB | $2.89–$8.50/hr | Limited (CoreWeave, Lambda, AWS) | Frontier model training; ~1.8× H100 memory |
| B200 | 192 GB | $5.50–$15.00/hr | Very limited (CoreWeave only in Q1 2026) | Blackwell architecture; newest NVLink fabric |

For most teams, H200 and B200 are not cost-effective in 2026. The use case is narrow: training models that genuinely require >80 GB of GPU memory per chip and where the 40–50% throughput improvement over H100 justifies the 2–3x price premium.
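One way to make that judgment concrete is to compare cost per unit of training throughput rather than raw hourly price. A sketch, where the 1.45× throughput multiplier is an assumption drawn from the 40–50% figure above, not a benchmark result:

```python
# Sketch: hourly cost per unit of training throughput, H100 vs. H200.
# The relative-throughput figure is an assumption for illustration.

def cost_per_throughput(hourly: float, relative_throughput: float) -> float:
    """Effective $/hr per unit of work, normalized to H100 = 1.0."""
    return hourly / relative_throughput

h100 = cost_per_throughput(1.74, 1.0)        # cheapest on-demand H100
h200_low = cost_per_throughput(2.89, 1.45)   # H200 at the low end of range
h200_high = cost_per_throughput(8.50, 1.45)  # H200 at the high end of range

print(f"H100: ${h100:.2f}  H200: ${h200_low:.2f}-${h200_high:.2f}")
```

At $2.89/hr the H200 is roughly break-even per unit of work (~$1.99 vs. $1.74); at $8.50/hr it costs about 3.4× more per unit of work, which is why the memory capacity has to be the deciding factor.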

Reserved vs. On-Demand: How Much Do You Save?

Committing to reserved capacity for one year or more typically cuts your per-hour cost by 35–50%, depending on the provider:

| GPU | On-Demand (Best) | 1-Year Reserved (Best) | Savings | Annual Delta |
|---|---|---|---|---|
| H100 SXM 80GB | $1.74/hr | $0.91/hr | 48% | $7,271/yr |
| A100 80GB | $1.29/hr | $0.78/hr | 40% | $4,468/yr |
| L40S 48GB | $0.89/hr | $0.56/hr | 37% | $2,891/yr |

Annual delta = cost savings per GPU at 24/7 usage (8,760 hours/year). An 8×H100 cluster on 1-year reserved vs. on-demand saves about $58,166/year at the cheapest provider, before accounting for the hyperscaler vs. specialist gap.
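The annual-delta math above can be reproduced directly, assuming 24/7 utilization; idle hours shrink the savings proportionally:

```python
# Sketch reproducing the reserved-vs-on-demand deltas above,
# assuming round-the-clock utilization.

HOURS_PER_YEAR = 24 * 365  # 8,760

def annual_delta(on_demand: float, reserved: float, gpus: int = 1) -> float:
    """Yearly savings from reserved pricing at 24/7 usage."""
    return (on_demand - reserved) * HOURS_PER_YEAR * gpus

print(f"H100:      ${annual_delta(1.74, 0.91):,.0f}/yr per GPU")
print(f"A100:      ${annual_delta(1.29, 0.78):,.0f}/yr per GPU")
print(f"8xH100:    ${annual_delta(1.74, 0.91, gpus=8):,.0f}/yr per cluster")
```

The flip side of the discount is commitment risk: a one-year reservation only pays off if the cluster actually stays busy, so teams with bursty workloads often blend a small reserved base with on-demand or spot overflow.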

What Drives GPU Price Differences in 2026?

The same GPU hardware can cost 4.7x as much at one provider as at another. Here is why:

  • Data center electricity costs: Specialized GPU clouds often colocate in low-cost power regions (Virginia, Texas, Oregon). Hyperscalers spread costs across global infrastructure with higher overhead.
  • Capital amortization: CoreWeave, Lambda, and RunPod run purpose-built GPU infrastructure with leaner capex models than hyperscalers, which also carry large general-purpose compute fleets.
  • Support and SLA overhead: AWS, GCP, and Azure include enterprise support, 99.9%+ SLAs, compliance certifications (SOC 2, HIPAA, FedRAMP), and 24/7 account management in their pricing. You pay for these whether you use them or not.
  • Network egress pricing: Hyperscaler egress fees can add $0.05–$0.09/GB. For large model training with frequent checkpointing, this adds up fast.
  • Spot/preemption availability: Providers with significant GPU overcapacity (Vast.ai, RunPod) can offer spot pricing at 50–70% below on-demand.
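The egress point deserves a number. A rough sketch of monthly checkpoint egress cost, where the checkpoint size and cadence are assumptions for illustration (a ~70B-parameter model in fp16 is on the order of 140 GB of weights):

```python
# Sketch: egress cost of pulling training checkpoints out of a hyperscaler.
# Checkpoint size and cadence are illustrative assumptions.

def monthly_egress_cost(checkpoint_gb: float,
                        checkpoints_per_day: float,
                        rate_per_gb: float) -> float:
    """Monthly egress bill for downloaded checkpoints (30-day month)."""
    return checkpoint_gb * checkpoints_per_day * 30 * rate_per_gb

# One ~140 GB checkpoint per day at the high end of the $0.05-$0.09/GB range:
print(f"${monthly_egress_cost(140, 1, 0.09):,.2f}/month in egress")
```

At $0.09/GB that single daily checkpoint adds roughly $378/month on top of compute, while most specialized GPU clouds charge little or nothing for the same transfer.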

How to Use This Data

Per-hour pricing is a starting point. Your actual decision should factor in:

  • Compliance requirements: HIPAA, SOC 2, FedRAMP may force you toward hyperscalers regardless of cost.
  • Egress volume: If you move >10 TB/month out of the cloud, calculate egress fees before choosing a provider.
  • GPU availability: Spot instances at the cheapest providers are often sold out. Verify availability before building a workload dependency on a specific provider.
  • Support needs: Production workloads with SLA requirements need providers with formal uptime guarantees and support contracts.

Use the GridStackHub Cost Calculator to see exact monthly cost estimates for your specific GPU type, count, region, and pricing model — with all 23+ providers compared side by side.

Bottom Line: Who Has the Cheapest GPUs in 2026?

For most AI/ML workloads without enterprise compliance requirements: Lambda Labs, RunPod, and CoreWeave consistently offer the best combination of price, reliability, and GPU availability. Vast.ai wins on raw price for workloads that can tolerate spot interruptions.

Hyperscalers (AWS, GCP, Azure) remain the right choice for regulated industries, teams already deeply integrated into their ecosystems, or workloads requiring global edge deployment. The 3–4x price premium buys enterprise infrastructure — but if you do not need it, you are leaving real money on the table.