Avoiding the issue itself is easy, but Rust is throwing errors…?
I also encountered errors when using an old version of Candle before, so this might be a similar case. Just in case, I’ll ping @jsulz
Yes to both.
1) Download WAN manually
Use the Hugging Face tools, and bypass the flaky Xet path if needed. These methods are current as of 2025-10-11.
A) CLI with file filters
pip install -U "huggingface_hub[cli]" hf_transfer
# Option 1: bypass Xet entirely
export HF_HUB_DISABLE_XET=1
# Option 2: try Rust downloader (faster on good links)
# export HF_HUB_ENABLE_HF_TRANSFER=1
# TI2V-5B (720p) essentials (base repo: root-level shards plus the .pth VAE/T5 weights)
hf download Wan-AI/Wan2.2-TI2V-5B \
  --include "diffusion_pytorch_model-*.safetensors" \
            "diffusion_pytorch_model.safetensors.index.json" \
            "Wan2.2_VAE.pth" "models_t5_umt5-xxl-enc-bf16.pth" \
            "config.json" "configuration.json" \
  --local-dir ./Wan2.2-TI2V-5B
The CLI supports --include/--exclude and --local-dir, and respects the env vars above. (Hugging Face)
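If a single file keeps failing mid-transfer, you can also fetch it on its own by passing the filename directly; a minimal sketch (filename taken from the include list above):
hf download Wan-AI/Wan2.2-TI2V-5B Wan2.2_VAE.pth --local-dir ./Wan2.2-TI2V-5B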
B) Python API (snapshot_download) with patterns
import os
os.environ["HF_HUB_DISABLE_XET"] = "1"  # or try HF_HUB_ENABLE_HF_TRANSFER=1; set before importing huggingface_hub
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Wan-AI/Wan2.2-T2V-A14B-Diffusers",
    local_dir="./Wan2.2-T2V-A14B-Diffusers",
    allow_patterns=[
        # A14B is a two-expert MoE; include the second expert's shards under transformer_2/ as well
        "transformer/diffusion_pytorch_model-*.safetensors",
        "transformer/diffusion_pytorch_model.safetensors.index.json",
        "transformer_2/diffusion_pytorch_model-*.safetensors",
        "transformer_2/diffusion_pytorch_model.safetensors.index.json",
        "vae/*", "text_encoder/*", "tokenizer/*", "scheduler/*", "model_index.json",
    ],
)
allow_patterns avoids downloading the whole repo; env vars control the transport backend. (Hugging Face)
C) Pick the right WAN repo
- A14B T2V Diffusers (large, full Diffusers layout). Model card updated 2025-08-25. Use includes like in B. (Hugging Face)
- TI2V-5B Diffusers (smaller, 720p on a 4090). Model card updated 2025-08-25. Includes ModelScope links. A download sketch follows after this list. (Hugging Face)
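For the TI2V-5B Diffusers repo, the same allow_patterns approach works; a minimal sketch, assuming it follows the standard Diffusers subfolder layout (set the same env vars as in B first):
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Wan-AI/Wan2.2-TI2V-5B-Diffusers",
    local_dir="./Wan2.2-TI2V-5B-Diffusers",
    allow_patterns=[
        "transformer/*", "vae/*", "text_encoder/*",
        "tokenizer/*", "scheduler/*", "model_index.json",
    ],
)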
D) If your network blocks HF
The TI2V-5B and A14B cards expose ModelScope mirrors; use their commands when HF is rate-limited or Xet is blocked. (Hugging Face)
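A rough sketch of the mirror route; the flags below assume the current ModelScope CLI, so double-check the exact command on the card for your repo before running:
pip install -U modelscope
modelscope download --model Wan-AI/Wan2.2-TI2V-5B --local_dir ./Wan2.2-TI2V-5B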
2) Modules that can “do the WAN job”
Pick by task first: text-to-video vs image-to-video. Then match VRAM and toolchain.
Text-to-Video (open, maintained)
- CogVideoX (2B/5B). Official Diffusers pipelines and training docs. Good prompt adherence. Active in 2024–2025. (Hugging Face)
- Mochi-1 preview. Apache-2.0. Runs via Diffusers. Strong motion. Community shows 161-frame runs on 24 GB with recent Diffusers. (Hugging Face)
- Open-Sora v2. Research-grade, heavier setup. Provides t2i2v pipeline examples. (Hugging Face)
Image-to-Video (stable baseline)
- Stable Video Diffusion (SVD, XT/XT-1.1). Solid i2v in Diffusers. Good swap-in for ComfyUI or Python workflows. (Hugging Face)
Research directions
- Pyramidal / temporal-pyramid flow models. Useful if you need newer architectures or plan to finetune. (arXiv)
Minimal code examples
CogVideoX (Diffusers)
# docs: https://huggingface.co/docs/diffusers/en/api/pipelines/cogvideox
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video
import torch

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16).to("cuda")
# .frames is a list per prompt; take [0] for the single prompt
video = pipe("A tabby cat astronaut slowly walks on the moon, cinematic, 24 fps", num_frames=49, height=480, width=720).frames[0]
export_to_video(video, "cogvideox.mp4", fps=8)
(Hugging Face)
Mochi-1 preview (Diffusers)
# docs: https://huggingface.co/docs/diffusers/en/api/pipelines/mochi
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video
pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", torch_dtype=torch.bfloat16).to("cuda")
frames = pipe("Aerial drone shot over snowy mountains at golden hour", num_frames=49, height=480, width=832).frames[0]
export_to_video(frames, "mochi.mp4", fps=30)
(Hugging Face)
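If you are closer to the 24 GB case mentioned above, the standard Diffusers memory levers help; a short sketch (enable_model_cpu_offload() also works on CogVideoXPipeline):
import torch
from diffusers import MochiPipeline
pipe = MochiPipeline.from_pretrained("genmo/mochi-1-preview", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # keep only the active submodule on the GPU
pipe.enable_vae_tiling()         # decode the latent video in tiles to cap peak VRAM
frames = pipe("Aerial drone shot over snowy mountains at golden hour", num_frames=49).frames[0]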
SVD i2v
# model card: https://huggingface.co/stabilityai/stable-video-diffusion-img2vid
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid", torch_dtype=torch.float16
).to("cuda")
image = load_image("input.jpg").resize((1024, 576))  # any source still; path is a placeholder
frames = pipe(image, decode_chunk_size=8).frames[0]
export_to_video(frames, "svd.mp4", fps=7)
(Hugging Face)
Quick selector
- Need T2V with open weights and Diffusers: start with CogVideoX-5B, then try Mochi-1 preview if you prefer Apache-2.0. (Hugging Face)
- Need I2V fast and simple: SVD. (Hugging Face)
- Want WAN-like quality with open stack and can afford setup: Open-Sora v2. (Hugging Face)