GPU compute gets all the attention — but it's only part of the picture. Understanding the full AI infrastructure cost stack helps you optimize the right things and avoid surprises on your cloud bill.
GPU Compute: 60–80% of Total Cost
GPU compute is the dominant cost for almost every AI workload. The wide range reflects workload type:
- Training-heavy workloads: GPU is 75–85% of total cost. You're running GPUs continuously, so storage and network are secondary.
- Inference at scale: GPU drops to 60–70% when serving millions of requests. Egress and networking grow as a share.
- Data pipeline workloads: GPU may be 50–60% with significant storage and CPU costs.
See live GPU prices: GPU Pricing Comparison →
Use our AI model cost breakdown for specific model cost estimates.
Networking and Egress: 5–15% of Total Cost
Egress (data leaving a cloud region) is the most commonly underestimated cost in AI infrastructure:
| Provider | Egress Rate | Notes |
|---|---|---|
| AWS | $0.09/GB after 100GB/mo | First 100GB free |
| Google Cloud | $0.08/GB (US) | Higher for inter-region |
| Azure | $0.087/GB | First 100GB free |
| CoreWeave | $0.05/GB | Lower than hyperscalers |
| Lambda Labs | Free up to 10TB/mo | Best for data-heavy workloads |
| Cloudflare R2 | $0.00/GB | Zero egress — popular for model weights |
For a 70B model serving 10,000 requests/day at roughly 2MB per response, egress is ~20GB/day ($1.80/day on AWS). Multiply by 365: $657/year just in egress. Use zero-egress storage (R2, Backblaze) for model weights and training data where possible.
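The egress arithmetic above can be sanity-checked in a few lines. This is a sketch, not a billing tool: `annual_egress_cost` is a hypothetical helper, the response size (~2MB, the size implied by the ~20GB/day figure) and the $0.09/GB AWS rate are taken from this section, and billing uses decimal GB.

```python
def annual_egress_cost(requests_per_day: int, mb_per_response: float,
                       rate_per_gb: float) -> float:
    """Yearly egress spend in dollars (decimal GB, as billed)."""
    gb_per_day = requests_per_day * mb_per_response / 1000
    return gb_per_day * rate_per_gb * 365

# ~2MB per response, 10,000 requests/day, AWS at $0.09/GB
print(f"${annual_egress_cost(10_000, 2.0, 0.09):,.0f}/year")  # $657/year
```

Swap in the rate from the table above (e.g. $0.05/GB for CoreWeave, $0.00 for R2) to see how much provider choice moves this line item.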
Storage: 10–20% of Total Cost
| Storage Type | Cost | Use Case |
|---|---|---|
| S3 / GCS (object) | $0.023/GB-month | Training data, model checkpoints |
| Cloudflare R2 | $0.015/GB-month | Model weights (zero egress) |
| NVMe local (cloud) | $0.10–0.25/GB-month | Active training data |
| EFS / NFS | $0.30/GB-month | Shared datasets across nodes |
Sizing tip: a 70B model in fp16 = ~140GB. In fp8 = ~70GB. Quantization halves your storage costs and often speeds up inference.
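The sizing rule of thumb above is just parameter count times bytes per parameter: 1B parameters at 1 byte each is ~1GB. A minimal sketch (the `model_size_gb` helper and precision table are illustrative, and this ignores optimizer state and quantization metadata overhead):

```python
# Bytes per parameter at common precisions
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1, "int4": 0.5}

def model_size_gb(params_billions: float, precision: str) -> float:
    """Approximate checkpoint size: parameters x bytes per parameter."""
    return params_billions * BYTES_PER_PARAM[precision]

for p in ("fp16", "fp8"):
    print(f"70B @ {p}: ~{model_size_gb(70, p):.0f}GB")
```

At the $0.023/GB-month S3 rate above, the fp16-to-fp8 drop (140GB to 70GB) saves only ~$1.60/month in storage, so the real win from quantization is the smaller GPU memory footprint and faster inference, not the storage bill.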
Energy Costs: Embedded in GPU Pricing
For cloud GPU users, energy costs are embedded in the provider's pricing — you don't pay electricity bills directly. But energy costs affect which providers are cheapest in which regions.
A single H100 draws approximately 700W (SXM5 spec). At a typical data center PUE of 1.3:
- Effective power draw per H100: ~910W
- Annual energy per H100 (24/7): ~7,970 kWh
- At Texas industrial rates ($0.045/kWh): ~$359/year electricity cost per GPU
- At California commercial rates ($0.18/kWh): ~$1,435/year per GPU
This 4× electricity cost differential explains why Texas and other low-cost power states attract GPU cloud buildouts. See: Best States for Data Centers 2026 →
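The per-GPU energy figures above follow directly from TDP, PUE, and the electricity rate. A sketch using the numbers from this section (`annual_energy_cost` is a hypothetical helper; real utilization is rarely 100%, so treat these as upper bounds):

```python
def annual_energy_cost(tdp_watts: float, pue: float, rate_per_kwh: float) -> float:
    """Yearly electricity cost for one GPU running 24/7.
    PUE scales IT load up to total facility load (cooling, power delivery)."""
    kwh_per_year = tdp_watts * pue / 1000 * 24 * 365
    return kwh_per_year * rate_per_kwh

# H100 SXM5 (~700W TDP) at a typical PUE of 1.3
print(round(annual_energy_cost(700, 1.3, 0.045)))  # Texas industrial rate
print(round(annual_energy_cost(700, 1.3, 0.18)))   # California commercial rate
```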
Full Cost Example: 70B Model Inference Endpoint
| Component | Spec | Monthly Cost | % of Total |
|---|---|---|---|
| GPU compute | 2× H100 on CoreWeave (reserved) | $2,570 | 93% |
| Storage | 500GB (model weights + cache) | $12 | <1% |
| Networking | ~600GB/mo egress | $30 | 1% |
| CPU instances | Load balancer + management | $80 | 3% |
| Monitoring | Datadog / Prometheus | $60 | 2% |
| Total | | $2,752/mo | 100% |

Note that GPU share lands above the headline 60–80% range here: a small two-GPU endpoint has almost no storage or egress to dilute it.
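Recomputing the shares from the table's line items keeps the percentages honest as you tweak components. A minimal sketch using the figures above:

```python
# Monthly line items from the 70B inference example (USD)
monthly = {
    "GPU compute": 2570,    # 2x H100 on CoreWeave, reserved
    "Storage": 12,
    "Networking": 30,
    "CPU instances": 80,
    "Monitoring": 60,
}

total = sum(monthly.values())
for item, cost in monthly.items():
    print(f"{item:14s} ${cost:>6,}  {cost / total:5.1%}")
print(f"{'Total':14s} ${total:>6,}")
```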
Calculate your full AI infrastructure cost
GPU, storage, egress — estimate total monthly costs across all providers.
Open Cost Calculator →