Hello,
I use ComfyUI. My graphics card model is NVIDIA GeForce RTX 5060 Ti with 8GB VRAM. Which model can I use to convert photos to videos?
Thank you.
Try Wan…
Hello,
Thank you so much for your reply.
1- Is the Wan model the one shown with the photo of a man playing the guitar?
2- In the video you shared, the model requires PyTorch 2.7, but ComfyUI uses a newer version of PyTorch. Doesn’t this cause a conflict?
Hello,
I followed the steps at How to Run Wan2.2 Image to Video GGUF Models in ComfyUI (Low VRAM) - Next Diffusion , but I got:
Any idea?
I think it’s just a VRAM shortage caused by using more than the card has. You might be able to avoid it by adjusting the settings.
(Especially on Windows,) other programs besides ComfyUI might be consuming VRAM, so keep an eye on those too. Even web browsers can sometimes eat a bit of VRAM…
Cause: your GPU ran out of VRAM. The error comes from the sampler step where the UNet runs over the full latent video tensor. In video, memory scales with width × height × frames × batch. A small increase in any of those can OOM an 8 GB card. (GitHub)
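As a rough illustration of that multiplicative scaling, here is a back-of-the-envelope sketch. The channel count and downscale factor below are illustrative assumptions, not values from the Wan source, and the real peak usage is dominated by activations, not the latent alone; the point is only how fast the numbers multiply:

```python
# Illustrative VRAM estimate for a latent video tensor (FP16).
# channels and spatial_down are assumed values for illustration only;
# the actual figures depend on the model's VAE.

def latent_bytes(width, height, frames, batch=1,
                 channels=16, spatial_down=8, bytes_per_elem=2):
    """Approximate size of one latent video tensor in bytes."""
    w, h = width // spatial_down, height // spatial_down
    return batch * channels * frames * w * h * bytes_per_elem

small = latent_bytes(672, 384, 33)   # modest starting settings
big   = latent_bytes(1280, 720, 81)  # higher resolution + more frames

print(f"small: {small / 2**20:.1f} MiB, big: {big / 2**20:.1f} MiB")
print(f"ratio: {big / small:.1f}x")  # every dimension multiplies the total
```

Bumping resolution and frame count together multiplies memory by nearly an order of magnitude here, which is why an 8 GB card tips over so easily.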
Use the GGUF UNet, not FP16
In your workflow, swap any “Load Diffusion Model” node for Unet Loader (GGUF) from the ComfyUI-GGUF node pack. Point it at Wan2.2-TI2V-5B-*.gguf in ComfyUI/models/unet/. If the FP16 .safetensors UNet stays wired in, you will OOM. (GitHub)
Install a quantized 5B build
Grab the Wan2.2-TI2V-5B-GGUF package. Quantized GGUF variants reduce VRAM at a small quality cost. Place UNet in models/unet, VAE in models/vae, and UMT5 text encoder in models/text_encoders. (Hugging Face)
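Assuming a default ComfyUI install, the placement looks like the sketch below. The directory names are the standard ComfyUI model folders from the steps above; the exact file names in the comments are placeholders for whichever quantization you downloaded, not the real release names:

```shell
# Run from the ComfyUI root. Creates the standard model folders if
# they don't exist yet (harmless if they already do).
mkdir -p models/unet models/vae models/text_encoders

# After downloading, the tree should look roughly like:
#   models/unet/Wan2.2-TI2V-5B-<quant>.gguf        (quantized UNet)
#   models/vae/<wan-vae>.safetensors               (VAE)
#   models/text_encoders/<umt5-encoder>.safetensors (UMT5 text encoder)
ls models/unet models/vae models/text_encoders
```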
Start with a tiny workload
Set the latent node to ~672×384 and 33 frames @ 24 fps, batch = 1 in both the latent/video node and KSampler. Increase later. The official Wan 2.2 page shows where to change length (frames) and confirms the 5B fits on 8 GB with native offloading. (ComfyUI)
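For reference, frame count maps to clip length as frames / fps, so these starting values give a very short clip. This helper is plain arithmetic, not a ComfyUI API:

```python
def clip_seconds(frames: int, fps: int = 24) -> float:
    """Duration of the output clip in seconds."""
    return frames / fps

print(clip_seconds(33))  # 33 frames @ 24 fps -> 1.375 s
print(clip_seconds(49))  # a longer run once VRAM headroom allows
```

To get a longer clip you raise the frame count, and the memory cost rises with it, which is why the advice is to start tiny and increase later.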
Turn on low-VRAM runtime knobs
In ComfyUI Settings → Server config: set VRAM management mode to auto or lowvram. Consider lowering reserved VRAM if you set it high. These controls exist to prevent OOM. (ComfyUI)
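If you prefer to set this at launch rather than in the settings UI, ComfyUI also accepts VRAM-management flags on the command line. This is a sketch; flag availability varies by version, so check `python main.py --help` on your install:

```shell
# Launch ComfyUI in low-VRAM mode from its root directory.
# --lowvram keeps only part of the model on the GPU at a time.
python main.py --lowvram
```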
Keep extras off the GPU
Text encoder and VAE can stay on CPU if VRAM is tight. The official template already uses offloading; use it as your base. (ComfyUI)
About “Lightning” speedups
LightX2V Wan2.2-Lightning currently documents 4-step distillation for the A14B models. TI2V-5B 4-step support is listed as “Todo,” so don’t expect a working 5B Lightning LoRA yet. (Hugging Face)
If you still OOM
Reduce width/height first, then frames, then steps. Double-check the UNet loader is the GGUF node, not FP16. The KSampler “Allocation on device” message is a straight VRAM-exceeded signal. (GitHub)
Hello,
Thank you so much for your reply.
I changed the resolution to 768×768 and the problem was fixed, but the output video is only 2 seconds long. If I use the Wan2.2-TI2V-5B-GGUF model, is it possible to produce a longer video?