Beginner's guide to renting a GPU for AI
Updated April 2026
You want to fine-tune Llama, train a LoRA, or run Stable Diffusion at scale, and your laptop GPU isn't going to cut it. Renting a GPU by the hour is the cheap, fast answer — but the landscape is messy. There are P2P marketplaces, decentralized networks, first-party clouds, hyperscalers, and inference-only platforms, and they don't compete on the same axes. This guide is the renter's-eye view: what to rent, where, and how to actually run your first job.
What "GPU rental" actually means
Three categories, in order of price (cheap → expensive):
- P2P marketplaces (Vast.ai, RunPod Community, TensorDock, Clore.ai). Independent hosts list cards. You bid or pick a fixed rate. Cheapest, but quality varies.
- Decentralized networks (io.net, Akash, Render). Crypto-incentivized compute pools. Cheap, often paid in the network's own token. Onboarding has more friction.
- First-party clouds (Lambda, RunPod Secure, CoreWeave, Paperspace). One vendor owns the datacenter. More expensive but more predictable — better for production workloads.
For most experimentation work — fine-tuning, training, batch inference, hobbyist projects — start at the P2P marketplace level. You can always graduate.
Which GPU do I actually need?
Match the GPU's VRAM to your model's memory footprint; a back-of-envelope estimator follows the list. Rough thresholds:
- Stable Diffusion 1.5 / SDXL inference, 7B LLM inference (4-bit) — RTX 3090 / 4090 (24 GB). $0.20–0.50/h on Vast.ai.
- QLoRA fine-tuning of 7B–13B models, SDXL training — A6000 / RTX 6000 Ada (48 GB). $0.50–1.50/h.
- 13B–34B fine-tuning, multi-batch inference — A100 (80 GB). $0.80–2.50/h on marketplaces, $1.50–4/h on first-party clouds.
- 70B fine-tuning, full-precision training — H100 (80 GB) or H200 (141 GB). $1.80–4.50/h marketplace, $3–7/h cloud.
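If you want to sanity-check these thresholds against your own model, the sketch below does the rough arithmetic. It assumes weights dominate inference memory, a 4x multiplier for full training (gradients plus two Adam states), and a 1.2x overhead factor for activations, KV cache, and CUDA context — rules of thumb, not measured numbers.

```python
# Back-of-envelope VRAM estimator. The 4x training multiplier and the
# 1.2x overhead factor are rough assumptions, not benchmarks.
BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}

def vram_gb(params_b: float, precision: str = "fp16",
            training: bool = False, overhead: float = 1.2) -> float:
    """Estimate VRAM in GB for a model with params_b billion parameters."""
    weights = params_b * BYTES_PER_PARAM[precision]  # 1B params * 1 byte ~ 1 GB
    total = weights * (4 if training else 1)         # grads + two Adam states
    return total * overhead                          # activations, KV cache, CUDA context

print(f"7B int4 inference:  {vram_gb(7, 'int4'):.1f} GB")   # ~4.2 GB -> 24 GB card is plenty
print(f"13B fp16 inference: {vram_gb(13, 'fp16'):.1f} GB")  # ~31 GB  -> 48 GB class
```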
For live cheapest prices on each tier, see the cheapest-now table.
Marketplace vs cloud: how to pick
Marketplace (Vast / RunPod Community / TensorDock): use it when you're experimenting, can tolerate occasional preemption, and want the lowest hourly rate. You'll often pay 30–60% less than a first-party cloud for the same GPU. Trade-offs: hosts vary in quality, network speeds vary, and "interruptible" tier means your job can get yanked mid-run.
Cloud (Lambda / RunPod Secure): use it when uptime matters, you're running multi-day training, or you need a specific datacenter region. You pay the premium for predictability.
Your first rented job, end-to-end
Here's the simplest path on Vast.ai or RunPod:
- Pick a GPU on the cheapest-now table or a per-GPU page. Note the provider it's cheapest on.
- Sign up with that provider and add $10–20 of credit. Most accept cards; some take crypto.
- Pick a Docker image. PyTorch 2.x with CUDA 12 is the safe default. Most providers ship "PyTorch + Jupyter" templates — start there.
- Set a disk size. 50 GB is enough for a Llama 7B run. Bigger if you're loading a dataset.
- Launch. SSH in or open the Jupyter URL the provider gives you.
- Run your job, then stop the instance; billing keeps ticking until you do. A quick GPU sanity check worth running first is shown after this list.
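Once you're inside the instance, spend thirty seconds confirming you got the card you paid for before starting a long job. A minimal PyTorch check, assuming the PyTorch + CUDA image from the Docker-image step:

```python
# Confirm the advertised GPU is actually present and visible to PyTorch.
import torch

assert torch.cuda.is_available(), "No CUDA device visible -- wrong image or host?"
props = torch.cuda.get_device_properties(0)
print(f"GPU:  {props.name}")
print(f"VRAM: {props.total_memory / 1024**3:.1f} GB")
print(f"CUDA: {torch.version.cuda}")
```

If the name or VRAM doesn't match the listing, stop the instance and report the host rather than burning paid hours debugging.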
Cost-tracking discipline
Hourly billing creeps up fast. A 4090 at $0.30/h left running over a forgotten weekend (~48 hours) comes to about $14. An H100 at $2/h over the same window is $96. Two habits that save real money:
- Set a hard budget cap in the provider's dashboard. Vast and RunPod both support per-instance auto-shutdown.
- Use spot / interruptible tiers for resumable workloads. Save training checkpoints every N steps so a preemption is a 2-minute hit, not a 4-hour loss (see the sketch after this list).
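Here's a minimal sketch of the checkpoint-and-resume pattern, using a toy linear model as a stand-in for your real training loop. The filename, save interval, and write-then-rename trick are illustrative choices, not provider requirements:

```python
# Checkpoint every N steps so a spot preemption costs minutes, not hours.
import os
import torch
import torch.nn as nn

CKPT = "checkpoint.pt"
SAVE_EVERY = 500
TOTAL_STEPS = 10_000

model = nn.Linear(128, 1)                  # stand-in for your real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

start_step = 0
if os.path.exists(CKPT):                   # resume after a preemption
    state = torch.load(CKPT)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optim"])
    start_step = state["step"] + 1

for step in range(start_step, TOTAL_STEPS):
    x = torch.randn(32, 128)               # stand-in batch
    loss = model(x).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % SAVE_EVERY == 0:
        tmp = CKPT + ".tmp"                # write-then-rename: never leaves a half-written file
        torch.save({"model": model.state_dict(),
                    "optim": optimizer.state_dict(),
                    "step": step}, tmp)
        os.replace(tmp, CKPT)
```

Write checkpoints to the instance's persistent disk (or sync them off-instance) so they survive the preemption itself.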
When to switch to a different provider
Marketplaces are commodity-priced; rates drift hourly. The X vs Y comparisons are useful when one platform's host fee makes the same GPU 10–20% more expensive on the renter side. If you're running more than 100 hours a month on one platform, that 15% gap is a real budget line; the arithmetic is sketched below.
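To make that concrete, here's the break-even math with assumed numbers (100 hours/month, an H100 at $2/h, a 15% price gap); plug in your own usage and the live rates you see:

```python
# Break-even sketch for switching providers. All numbers are assumptions.
hours_per_month = 100
rate = 2.00        # $/h for an H100 on your current platform
price_gap = 0.15   # the other platform lists the same card 15% cheaper

monthly_spend = hours_per_month * rate
monthly_savings = monthly_spend * price_gap
print(f"Monthly spend: ${monthly_spend:.0f}")            # $200
print(f"Savings if you switch: ${monthly_savings:.0f}")  # $30/month
```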