Cheapest A100 80GB available today — 89% cheaper than AWS p4d equivalent ($3.67–$4.84/hr). Same NVIDIA A100 silicon, different supply chain. The gap has never been wider.
A100 Cloud Pricing — Live Table (April 2026)
GridStackHub tracks A100 pricing across 15+ cloud providers daily. The table below is pulled live from our database and sorted by price per GPU per hour, cheapest first. Both A100 80GB and A100 40GB records are shown where available.
| # | Provider | GPU | Per GPU/hr | Type | GPUs | Notes |
|---|---|---|---|---|---|---|
| – | Live A100 pricing loads here | – | – | – | – | – |
The A100 price floor is $0.42/hr and falling. Community GPU networks like Salad Cloud aggregate underutilized A100 capacity from data centers worldwide. Availability varies — for production workloads requiring SLA guarantees, $0.78–$1.50/hr on-demand options are more reliable. Use the GPU Cost Calculator to model your actual monthly spend.
Why A100 Is at Historic Lows in 2026
A100 prices have never been cheaper. Three converging forces drove the 2026 price floor:
- Blackwell displacement. NVIDIA's B200 and GB200 NVL72 racks shipped at scale in Q1 2026, pulling enterprise demand away from A100 and H100. Teams that need frontier performance moved up. Teams that need cost efficiency stayed on A100 — and the price gap widened to match.
- Hyperscaler inventory rotation. AWS, Google, Microsoft, and Meta began retiring A100 clusters in favor of B200 and their own custom silicon (Trainium 3, TPU v6, Maia, MTIA). That used A100 supply flowed into the secondary market, dramatically expanding available capacity.
- Community GPU network maturity. Salad Cloud, Vast.ai, and similar platforms aggregated A100 instances from commercial data centers and research institutions. What was once $2–$3/hr on AWS can now be sourced from the same-generation silicon at $0.42–$0.78/hr through these networks.
The result: Salad Cloud at $0.42/hr, Thunder Compute at $0.78/hr, and Vast.ai starting at $0.80/hr, versus AWS p4d.24xlarge at $3.67–$4.84/hr per A100. At the floor that is an 8–11× price gap for identical A100 silicon; even the more SLA-friendly $0.78–$0.80 options come in 4–6× cheaper.
Price window note. A100 pricing at $0.42/hr is an artifact of supply temporarily exceeding demand. As older A100 inventory retires and enterprise demand normalizes around H100/B200, the floor will likely move back toward $0.75–$1.00/hr by 2027. If you have long-horizon A100 workloads, now is the time to lock in reserved pricing.
A100 80GB vs A100 40GB: Full Comparison
Two A100 variants exist. Understanding the difference prevents costly mistakes when provisioning at scale:
| Spec | A100 80GB SXM4 | A100 40GB PCIe | Winner |
|---|---|---|---|
| GPU Memory | 80 GB HBM2e | 40 GB HBM2 | 80GB +2× |
| Memory Bandwidth | 2,000 GB/s | 1,555 GB/s | 80GB +29% |
| TF32 Throughput | 156 TFLOPS (312 w/ sparsity) | 156 TFLOPS (312 w/ sparsity) | Tied |
| BF16/FP16 Throughput | 312 TFLOPS (624 w/ sparsity) | 312 TFLOPS (624 w/ sparsity) | Tied |
| NVLink bandwidth | 600 GB/s | 400 GB/s | 80GB +50% |
| Max model at BF16 | ~34B params | ~16B params | 80GB +2× |
| Max model at INT4 | ~160B params | ~80B params | 80GB +2× |
| Typical cloud price | $0.42–$3.67/hr | $0.125–$1.79/hr | 40GB (cheaper) |
| Cost per GB VRAM per hour | $0.0053 (at $0.42/hr) | $0.0031 (at $0.125/hr) | 40GB (at floor) |
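The "max model" rows above are simple bytes-per-parameter arithmetic. Below is a minimal sketch of that estimate; the 20% headroom factor for activations, KV cache, and CUDA context is an assumption (the table's ~34B and ~160B figures assume slightly less overhead), and the numbers ignore batch size and context length entirely.

```python
# Rough estimate of the largest model that fits in a given amount of VRAM.
# Assumption: ~20% of memory is held back for activations, KV cache, and the
# CUDA context; real headroom depends on batch size and sequence length.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

def max_params_billion(vram_gb: float, precision: str, overhead: float = 0.2) -> float:
    usable_bytes = vram_gb * 1e9 * (1 - overhead)
    return usable_bytes / BYTES_PER_PARAM[precision] / 1e9

for gpu, vram in [("A100 80GB", 80), ("A100 40GB", 40)]:
    for precision in ("bf16", "int4"):
        print(f"{gpu} @ {precision}: ~{max_params_billion(vram, precision):.0f}B params")
# A100 80GB @ bf16: ~32B    A100 80GB @ int4: ~128B
# A100 40GB @ bf16: ~16B    A100 40GB @ int4: ~64B
```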
Which variant to choose
A100 80GB is the right choice for most teams:
- Running 7B–34B models in BF16 precision (single GPU inference)
- Fine-tuning 13B–34B models with full parameter updates
- High-throughput batched inference needing large KV-cache allocation
- Multi-GPU tensor parallelism jobs where NVLink bandwidth matters
A100 40GB makes sense when:
- Your model and batch size are confirmed to fit in 40GB
- You need maximum GPU density (e.g., 8× 40GB vs 4× 80GB per node)
- Running smaller models (7B at BF16, 13B at INT8) at maximum throughput per dollar
- You're budget-constrained and Vultr's $0.125/hr 40GB tier is available for your workload
When A100 Beats H100 in 2026
The H100 is not always the best choice. For a significant class of workloads, the A100 at 2026 prices is strictly better on cost per output:
✓ LLM Inference (7B–34B)
vLLM and TGI run 7B–34B models efficiently on A100 80GB. At $0.78/hr vs H100's $1.74/hr minimum, the A100 saves 55% on inference costs without meaningful latency difference for most serving patterns.
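For context, here is a minimal vLLM sketch of single-GPU batched inference on an A100 80GB. The model name and sampling settings are illustrative placeholders, not a recommendation from this page.

```python
# pip install vllm  -- offline batched inference on a single A100 80GB
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder 8B model
    dtype="bfloat16",
    gpu_memory_utilization=0.90,  # leave headroom for the CUDA context
)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain NVLink in one paragraph."], params)
print(outputs[0].outputs[0].text)
```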
✓ LoRA / QLoRA Fine-tuning
LoRA fine-tuning for 7B–13B models on an A100 80GB completes in comparable wall-clock time to an H100 at roughly a third to a half of the hourly cost. For non-latency-sensitive training jobs, this is among the best $/output available in 2026.
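A minimal PEFT LoRA setup as it might look for a 7B model on a single A100 80GB; the base model, rank, and target modules are illustrative assumptions, and QLoRA would add 4-bit quantization of the base weights on top of this.

```python
# pip install transformers peft  -- LoRA fine-tuning setup for a 7B model
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",        # placeholder base model
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically <1% of weights are trainable
```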
✓ Stable Diffusion / Image Gen
A100's 80GB VRAM enables large batch sizes for SDXL, Flux, and Stable Diffusion 3. With the same 80GB of VRAM as an H100 SXM at half the hourly price or less, throughput per dollar is significantly higher.
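A minimal diffusers sketch of batched SDXL generation on a single A100; the 8 images per prompt is an assumption that fits comfortably in 80 GB but would need tuning on smaller cards.

```python
# pip install diffusers transformers accelerate  -- batched SDXL on one A100
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# 80 GB of VRAM lets the whole batch run in one forward pass.
images = pipe(
    prompt="studio photo of a vintage motherboard, dramatic lighting",
    num_images_per_prompt=8,
    num_inference_steps=30,
).images
for i, img in enumerate(images):
    img.save(f"sdxl_{i:02d}.png")
```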
✓ Batch Processing Jobs
Any workload that can run overnight (embeddings, document processing, batch scoring) benefits from A100's $0.42–$0.78/hr pricing. A 12-hour batch job costs $5–$9 per GPU at those rates, versus roughly $44–$58 at AWS p4d on-demand rates.
⚡ H100 Better: Large-Scale Training
H100's FP8 tensor cores (3,958 TFLOPS with sparsity, versus the A100's 312 TFLOPS sparse TF32) provide a genuine speedup for continuous pretraining at scale. The TCO math favors H100 when you're training more than roughly 10B tokens per week.
⚡ H100 Better: 70B+ Inference
For low-latency inference on 70B+ models (Llama 3 70B, Mixtral 8x22B), H100's higher memory bandwidth and NVLink throughput reduce TTFT and inter-token latency meaningfully at scale.
Rule of thumb: If your workload fits in 80GB and you're not constrained by training throughput or P99 inference latency, run it on A100 in 2026. The 3–10× cost difference compounds dramatically at scale.
Track A100 Prices and Get Alerts
A100 pricing is moving fast in 2026. As Blackwell supply increases and older A100 inventory retires, prices will shift. Get ahead of it:
Get A100 price drop alerts
We'll notify you when A100 prices drop further, new providers list capacity, or a better deal appears. Free — no credit card required.
Model your exact A100 cost for your workload
Set your model size, hours per month, and precision — see exact monthly cost for every A100 provider in 60 seconds. Includes H100 and B200 comparison.
Open GPU Cost Calculator →
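If you'd rather script the same math the calculator does, here is a minimal sketch using the per-GPU rates quoted in this article. Treat the numbers as point-in-time figures, not live data, and adjust the hours and GPU count for your own workload.

```python
# Monthly cost per provider for a single A100 80GB, using rates quoted above.
HOURLY_RATES = {
    "Salad Cloud": 0.42,        # community network, variable availability
    "Thunder Compute": 0.78,
    "Vast.ai": 0.80,
    "AWS p4d (low)": 3.67,
    "AWS p4d (high)": 4.84,
}

def monthly_cost(rate_per_hour: float, hours_per_month: float, gpus: int = 1) -> float:
    return rate_per_hour * hours_per_month * gpus

HOURS = 300  # example workload: roughly 10 hours of GPU time per day
for provider, rate in HOURLY_RATES.items():
    print(f"{provider:<16} ${monthly_cost(rate, HOURS):>8,.2f}/month")
```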