How to Stop Re-Downloading Your Models Every Time You Spin Up a ComfyUI Cloud GPU

Team Aquanode


Sarthak Vaish

JUNE 14, 2026


Picture the version of this where it just works. You rent a GPU, any GPU, on whatever provider is cheapest or actually has stock today. You open ComfyUI. Your 80GB of checkpoints, your LoRAs, your VAE, your upscalers, your custom nodes pinned to the exact commits that don't throw red errors, your saved workflows, all of it is already sitting there at the same paths. No Hugging Face downloads. No Civitai login. No "guess the folder again." You load a workflow and generate inside two minutes. That's the whole post. Everything below is how people try to get there on a stateless cloud GPU, where each attempt quietly breaks, and the one approach that holds when you change providers.

I dug into r/comfyui, the RunPod docs, and a stack of community tutorials, and the same thing keeps showing up: people on cloud GPUs burn a real chunk of every session just re-downloading the models they already downloaded last week. One writeup put a number on it that stuck with me.

"The 25 minutes you spend re-downloading a model because you didn't persist the right directory." Source: Medium/@velinxs, "Vast.ai vs RunPod pricing in 2026"

TL;DR: ComfyUI on a stateless cloud GPU wipes your models the moment the pod terminates, so most people re-download 20-100GB every session. The community fix is a network volume plus symlinks, which survives a restart but stays locked to one provider and one datacenter region. To actually keep your models between sessions and across providers, you need a snapshot that captures the model files themselves and restores them at their original paths on whatever GPU you spin up next.

Why this keeps happening: the model library got huge and the GPU stays stateless

A modern ComfyUI setup is not light anymore. It stopped being light around the time Flux landed. Here's the rough math on a normal-ish image-gen creator's library, not even a power user:

What's in your ComfyUI folder       Typical size
One Flux base checkpoint (fp16)     ~22 GB
A couple of SDXL checkpoints        ~6-7 GB each
10-20 LoRAs                         ~100-300 MB each, so 1-6 GB
VAEs                                ~300 MB-1.5 GB
Upscale models (ESRGAN, etc.)       ~70-300 MB each
ControlNet / IP-Adapter models      ~1-3 GB each
CLIP / text encoders for Flux       ~5-10 GB

Add it up and a single working setup is 30GB on the low end and crosses 100GB fast once you keep a few video models around (WAN, Hunyuan, LTX each bring their own multi-GB weights). Now point that at a cloud GPU. Every cloud GPU is stateless by design. The local disk on a pod is ephemeral, and the providers say so plainly:

"By default, each pod has a container volume (the local disk attached to the pod) which is ephemeral. Once you stop or terminate the pod, that storage is wiped." Source: RunPod, GPU Infrastructure Playbook

So the mismatch is the whole problem. Your setup is heavy and stateful. The box you rent is light and forgetful. Terminate the pod and the 80GB you pulled down is gone, and next session you pull it down again. That is the re-download tax, and it's the part of a cheap GPU nobody puts on the price tag. The real cost of a $0.34/hr 4090 isn't $0.34. It's the 40 minutes re-downloading Flux and your LoRAs before the card does a single useful thing.
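To see how much the idle download inflates the sticker price, here is a minimal sketch of the arithmetic. The $0.34/hr rate and the 40-minute download are the figures used above; the 60 minutes of actual generation per session is an assumption, and `effective_hourly_rate` is a hypothetical helper name.

```python
def effective_hourly_rate(list_rate: float, download_min: float, useful_min: float) -> float:
    """Dollars paid per hour of actual generation when download time is billed too."""
    if useful_min <= 0:
        raise ValueError("useful_min must be positive")
    total_cost = list_rate * (download_min + useful_min) / 60  # every minute is billed
    return total_cost / (useful_min / 60)


# $0.34/hr card, 40 min re-downloading, an assumed 60 min of real work
rate = effective_hourly_rate(0.34, download_min=40, useful_min=60)
print(f"${rate:.2f} per useful hour")
```

Under those assumptions the $0.34 card works out to roughly $0.57 per hour of generation, and shorter sessions push it higher.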

Below are the four things people actually do about it, roughly in the order they discover them, and exactly where each one stops working.

Step 1: Just re-download every time (and why it silently costs the most)

This is the default. You spin up a fresh pod, open a terminal, and start pulling. It's also what most tutorials quietly assume you'll do, because it needs zero setup.

It works. It's just expensive in the way that doesn't show up until you tally it. The Hugging Face and Civitai pull for a full library is bandwidth-bound, not compute-bound, so you are paying GPU-hour rates to a card that's sitting idle waiting on a download. On a metered pod that 25-to-40-minute pull is billed at the same rate as generation. Do that across a few sessions a week and the "cheap" GPU quietly became the expensive one.
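Because the pull is bandwidth-bound, how long it takes depends only on library size and sustained throughput, not on which card you rented. A rough sketch, assuming an 80 GB library, a sustained 35 MB/s (an illustrative figure, so measure your own), and four sessions a week; `download_minutes` is a hypothetical helper name:

```python
def download_minutes(library_gb: float, throughput_mb_s: float) -> float:
    """Minutes to pull library_gb at a sustained throughput (1 GB = 1000 MB)."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput_mb_s must be positive")
    return library_gb * 1000 / throughput_mb_s / 60


minutes = download_minutes(80, 35)   # assumed library size and throughput
idle_hours = minutes * 4 / 60        # assumed four sessions a week
print(f"{minutes:.0f} min per session, {idle_hours:.1f} idle GPU-hours per week")
```

At those assumed numbers that is about 38 minutes per session, which sits inside the 25-to-40-minute range people report.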

The second cost is human. Piecemeal downloading means guessing folder locations again, getting one wrong, watching a workflow throw a missing-model error, and going back to move the file. The ComfyUI community documented this loop step by step:

"That's too many manual steps, which means it's slow, error-prone, and easy to forget when you come back a week later." Source: dev.to/promptingpixels, "One-command ComfyUI on Cloud GPUs"

Re-downloading is fine as a one-off. As a per-session ritual it's the thing this entire post exists to kill.

Step 2: Write a download script so at least it's automated

The natural next move, and a good one. Instead of pulling models by hand you write a setup script: a list of wget or hf download lines that pull your checkpoints, LoRAs, and VAE into the right ComfyUI directories, plus the git clone lines for your custom nodes. People share these as gists. There's a whole little genre of "one-command ComfyUI on cloud" tooling built exactly because the manual version is so painful.

This fixes the error-prone part of the problem. It does not fix the slow part or the billed part. The script still downloads 80GB over the network every single time you run it, you're still paying GPU time while it runs, and you're still at the mercy of whether Hugging Face is fast today and whether that Civitai link still resolves. It also doesn't capture the part that breaks workflows: a plain git clone of a custom node pulls the latest commit, not the commit that actually worked with your workflow, so a node that updated since last week can throw a wall of red on load. Automation makes the re-download repeatable. It doesn't make it stop.
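One way to close the "pulls latest" gap inside a setup script is to record a commit for each custom node and check it out right after cloning. The sketch below only builds the git commands. The repository URL and commit hash are placeholders, and `PINNED_NODES`, `pinned_clone_commands`, and `run_all` are hypothetical names. It addresses version drift only, not download time.

```python
import subprocess
from pathlib import Path

# Placeholder entry: swap in your own repos and the commits that worked for you.
PINNED_NODES = {
    "example-node": ("https://example.com/you/example-node.git", "0123abc"),
}


def pinned_clone_commands(nodes: dict, custom_nodes_dir: Path) -> list[list[str]]:
    """Return git argv lists that clone each node, then check out its pinned commit."""
    commands = []
    for name, (url, commit) in nodes.items():
        target = str(custom_nodes_dir / name)
        commands.append(["git", "clone", url, target])
        commands.append(["git", "-C", target, "checkout", "--detach", commit])
    return commands


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Print the commands instead of running them, so the sketch is safe to try anywhere.
for argv in pinned_clone_commands(PINNED_NODES, Path("/comfyui/custom_nodes")):
    print(" ".join(argv))
```

Keeping the command builder separate from `run_all` makes it easy to review what will be executed before any of it runs on a billed pod.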

Step 3: Attach a network volume and symlink your models into it

This is the real community fix, and it's the one most "keep your ComfyUI models on RunPod" tutorials land on. You create a persistent network volume, put your models and custom nodes on it, and symlink them into ComfyUI's directories so the app finds them where it expects:

ln -sf /runpod-volume/custom_nodes/* /comfyui/custom_nodes/
ln -sf /runpod-volume/models/checkpoints/* /comfyui/models/checkpoints/

Now the volume outlives the pod. Terminate the box, spin up a new one, re-attach the volume, and your files are there. The tutorials describe exactly this payoff:

"Your volume retains all ComfyUI files, models, and workflows, even after a pod is stopped or deleted, saving you from re-downloading and reconfiguring everything." Source: Next Diffusion, ComfyUI + RunPod Network Volume guide

This is genuinely better. For a single-provider, single-region workflow it solves the re-download tax. If you'd asked me "how do I keep my ComfyUI models between sessions on one provider," this is the answer. So use it.
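The two ln -sf lines above assume the target folders already exist and that nothing real is sitting in the way. If you would rather script the linking defensively, here is a Python sketch of the same idea using the paths from that example. `link_entries` is a hypothetical helper, not part of ComfyUI or RunPod.

```python
from pathlib import Path


def link_entries(volume_dir: Path, app_dir: Path) -> list[Path]:
    """Symlink every entry of volume_dir into app_dir and return the links created."""
    app_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for src in sorted(volume_dir.iterdir()):
        dst = app_dir / src.name
        if dst.is_symlink():
            dst.unlink()      # existing or stale link: replace it
        elif dst.exists():
            continue          # a real file or directory: leave it untouched
        dst.symlink_to(src, target_is_directory=src.is_dir())
        created.append(dst)
    return created


if __name__ == "__main__":
    for sub in ("custom_nodes", "models/checkpoints"):
        source = Path("/runpod-volume") / sub
        if source.is_dir():   # skip quietly when the volume is not attached
            link_entries(source, Path("/comfyui") / sub)
```

Running it twice is harmless: existing links are refreshed, and anything that is a real file in ComfyUI's folders is never overwritten.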

Here's where it stops. A network volume is pinned to one provider and one datacenter region, and the provider docs say so directly:

"Volumes are region-specific. If you change GPU regions later (e.g., EU-RO to US-CA), you'll need to manually transfer your data to a new volume in that region." Source: RunPod, Network Volumes docs

So the volume keeps your models, but only as long as you stay on that provider, in that region. The moment a cheaper 4090 shows up in a different region, or the H100 you want for a video model is on a different provider entirely, or that region is simply out of stock, the volume can't follow you. You're back to manually transferring the whole library, which is re-downloading by another name. The persistence is real, it's just leashed to one datacenter. For an audience that switches providers specifically to chase the cheapest card, or just one that's in stock, a fix that forbids switching is a half-fix.

Step 4: Pay a managed ComfyUI service to never set up at all

The other escape hatch is to stop renting raw GPUs and pay a managed ComfyUI host (RunComfy, ThinkDiffusion, Comfy Cloud) that ships a pre-built environment. Their whole pitch is the absence of this problem:

"Skip the dependencies, custom nodes, and model downloads. Open the link and run." Source: managed-ComfyUI marketing, quoted in use-apify.com

It's a real fix for setup. The catch is two-sided. One, you pay a premium for it, often an H100 around $4.49/hr when the same card is under $2 on a raw cloud, and you have zero portability off their platform. Two, and this is the subtle one for anyone who's been burned, their template runs their ComfyUI version with their node versions, which may not match your pinned working setup. One real creator paid for it until the cost crossed a line and walked:
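A quick way to weigh the premium is to ask how short a session has to be before the managed price wins. The sketch below uses the $4.49/hr figure from above, takes "under $2" as $2.00 (an upper bound), reuses the 40-minute re-download from earlier, and assumes the managed host has no setup time at all. `break_even_session_hours` is a hypothetical helper name.

```python
def break_even_session_hours(managed_rate: float, raw_rate: float, setup_hours: float) -> float:
    """Hours of real work at which managed and raw hosting cost the same.

    Managed bills only the work: managed_rate * t
    Raw bills setup plus work:   raw_rate * (setup_hours + t)
    """
    if managed_rate <= raw_rate:
        raise ValueError("managed_rate must exceed raw_rate for a break-even to exist")
    return raw_rate * setup_hours / (managed_rate - raw_rate)


hours = break_even_session_hours(4.49, 2.00, setup_hours=40 / 60)
print(f"break-even at about {hours * 60:.0f} minutes of work")
```

With these inputs the managed host only comes out cheaper for sessions shorter than roughly 32 minutes of actual work. Past that point the raw GPU is cheaper even when it pays the full re-download every time.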

"RunComfy is great to play with video models without needing a high-end system. But the price increase drove me to just buy a 5090 and cancel. Money spent on cloud fees is just gone; buying your own gets you an asset." Source: Reddit user, quoted in a RunComfy review, 2026

So managed hosting trades the re-download tax for a markup-plus-lock-in tax. You stop downloading models and start renting someone else's idea of your environment.

What actually works: snapshot the models themselves, restore them anywhere

Step back and notice what every option above gets wrong. Re-downloading and scripts rebuild your model library from the internet each time. The managed host swaps your library for theirs. The network volume stores yours but won't let it travel. The thing nobody in that list does is treat your actual files, the bytes already on the box, as the thing to capture and carry.

That's the approach we built Aquanode around. Instead of a region-locked volume or a fresh download, you take a snapshot of your running ComfyUI setup, and the snapshot captures the files: your models/ tree, your custom_nodes/ at their exact pinned commits, the venv, and your workflows. When you spin up a GPU on any of the 9 providers we run, the snapshot restores those files to their original paths. Your 22GB Flux checkpoint isn't re-downloaded from Hugging Face. It's restored from your snapshot onto the new box, and a custom node comes back at the commit that worked, not at latest.

The difference from a network volume is the part that matters for switching: the snapshot isn't anchored to a provider or a region. That's the leash this whole post has been about. Spin up a 4090 in one region today and an H100 on a different provider next week, restore the same snapshot to both, and the model library is just there at the same paths. No re-download, no manual transfer, no region transfer fee.

We validated this end-to-end on 2026-06-12. A real Pause then Resume of a ComfyUI deploy round-tripped the full /opt/ComfyUI environment: the install at its commit, the venv packages, the custom nodes at their commits, and a 2.13GB model checkpoint, restored bit-for-bit identical (SHA256-matched) on the new box. So this isn't a "trust us" claim. The bytes match.
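If you want to run the same kind of check on your own restore, hashing the tree before and after is enough. This is a generic sketch using Python's standard library, not Aquanode's validation tooling. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so multi-GB checkpoints never load into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        path.relative_to(root).as_posix(): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_paths(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Paths that are missing, added, or whose contents differ between two manifests."""
    return sorted(p for p in before.keys() | after.keys() if before.get(p) != after.get(p))
```

Build a manifest of your ComfyUI directory (for example `tree_manifest(Path("/opt/ComfyUI"))`) before the snapshot, save it somewhere off the box, build it again after restore, and `changed_paths` should come back empty.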

One honest limit, because the audience reading this has been lied to by restore buttons before. Restore brings your environment back, models and nodes and all, to the new GPU. It does not auto-launch ComfyUI for you yet. You relaunch the app once after restore, then you're generating. You get your studio back without re-downloading it. You don't get an instant-on running server. That's the real boundary today, and I'd rather state it than have you find it.

What you should actually do

Pick by how far you need your models to travel.

  • You only ever use one provider, one region. Use a network volume with symlinks (Step 3). It genuinely solves the re-download tax inside that box. Don't over-engineer it.
  • You switch providers or regions to chase cheap or available GPUs. The volume can't follow you and managed hosting locks you in. You want a provider-agnostic snapshot that captures the model files and restores them at their original paths on whatever card you rent next, the way we describe in our breakdown of persistent cross-provider ComfyUI.
  • You mostly want to never set up again and don't mind a markup. A managed ComfyUI host is fine, just go in knowing the price premium and the version-drift risk. Our take on the managed-vs-raw tradeoff walks through the math.

Whichever you pick, the principle is the same: the goal is to stop paying GPU-hour rates to a card that's busy downloading files you already own. Keep the models, don't re-fetch them.

About the author

I'm on the team at Aquanode. I'm not a full-time ComfyUI artist, so this is written as someone who read a lot of r/comfyui and provider docs rather than someone claiming to know your exact workflow. What I can speak to is the persistence side: we built and validated a snapshot that captures a full ComfyUI environment, models included, and restores it byte-for-byte on any of 9 GPU providers. The sources below are where the pain quotes come from.

Sources

