The full story takes a while to explain, but put simply: installing FlashAttention can be quite a challenge depending on your environment.
I’ll break this into:
- What exactly your error means
- The real causes (there are two)
- Concrete solutions you can choose from (with commands)
1. What your error actually means
You already have Torch:
Successfully installed ... torch-2.6.0+cu124 torchaudio-2.6.0+cu124 torchvision-0.21.0+cu124
But when installing flash_attn==2.7.4.post1 you get:
Installing build dependencies ... done
Getting requirements to build wheel ... error
...
ModuleNotFoundError: No module named 'torch'
ERROR: Failed to build 'flash_attn'
and again via requirements.txt:
Collecting flash-attn==2.7.4.post1 (from -r requirements.txt)
...
Getting requirements to build wheel ... error
ModuleNotFoundError: No module named 'torch'
Key points:
- Torch is installed in your Conda env.
- The error is thrown inside a temporary build environment that pip creates just for building flash_attn. That temporary env does not see your torch.
- This is a known, common problem with FlashAttention: PyPI's page explicitly says that if you see ModuleNotFoundError: No module named 'torch', it is "likely because of pypi's installation isolation" and tells you to use pip install flash-attn --no-build-isolation. (PyPI)
On top of that, you are on Windows, and FlashAttention is a heavy CUDA/C++ extension that is mainly tested on Linux. A lot of people hit additional problems building it from source on Windows and solve it via prebuilt wheels or WSL2. (GitHub)
So:
- Immediate cause: pip's build-isolation environment cannot import torch.
- Structural cause: FlashAttention is a compiled CUDA extension with weak native Windows support.
2. Background: what pip is doing and why Torch “disappears”
Modern pip uses PEP 517 build isolation for packages that declare a pyproject.toml (FlashAttention does).
When you run:
pip install flash_attn==2.7.4.post1
pip does roughly:
- Reads pyproject.toml and sees a build backend (e.g. setuptools) plus a list of build requirements.
- Creates a temporary virtual environment in a temp directory (like your pip-build-env-gqm7d9al).
- Installs only the declared build-time dependencies there (setuptools, wheel, and whatever else the project lists).
- Runs the build backend inside that isolated env to:
  - compute build requirements (get_requires_for_build_wheel),
  - generate the wheel.
Inside this isolated build env, torch is not installed, so import torch fails with ModuleNotFoundError: No module named 'torch'.
This exact pattern shows up in multiple issues on the FlashAttention repo and Stack Overflow, with identical logs (Installing build dependencies ... done, Getting requirements to build wheel ... error, then ModuleNotFoundError: No module named 'torch'). (GitHub)
The FlashAttention maintainers solved it by telling people to turn off build isolation:
pip install flash-attn --no-build-isolation
which forces pip to build using your real environment, where Torch is installed. (PyPI)
So the error message is misleading: Torch is installed, but not in the small private env pip uses for building.
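To make that concrete, here is a minimal illustration of the pattern FlashAttention's build follows (a simplified sketch, not its actual setup.py): any build script that imports torch at module level can only run in an environment where Torch is already installed, and pip's isolated build env is not such an environment.

# simplified sketch of a CUDA-extension setup.py, not FlashAttention's real file
from setuptools import setup
import torch  # fails with ModuleNotFoundError inside pip's isolated build env
from torch.utils.cpp_extension import CUDAExtension, BuildExtension

setup(
    name="some_cuda_extension",  # hypothetical package name
    ext_modules=[CUDAExtension("some_cuda_extension._C", ["csrc/kernels.cu"])],
    cmdclass={"build_ext": BuildExtension},
)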
3. Background: why FlashAttention is fussy, especially on Windows
FlashAttention is not a typical pure-Python library:
- It provides custom fused CUDA kernels for attention, tightly tied to:
  - specific PyTorch versions,
  - specific CUDA versions,
  - GPU architectures. (PyPI)
- It is primarily developed and tested on Linux; Windows support exists but is patchy and depends on custom wheels or manual builds. (GitHub)
For Linux, the recommended path is:
- Use the right PyTorch+CUDA combination (you already have Torch 2.6.0+cu124).
- Install FlashAttention with pip install flash-attn --no-build-isolation. (PyPI)
For Windows:
- Many users report that building from source is slow and fragile (hours of compile time, Visual Studio / NVCC configuration, etc.). Guides suggest:
  - using MSVC Build Tools, the CUDA toolkit, and Ninja,
  - running pip in a VS "x64 Native Tools" prompt,
  - setting MAX_JOBS and other environment variables. (Reddit)
- Because of this, some people now publish pre-built Windows wheels just for FlashAttention:
  - mjun0812/flash-attention-prebuild-wheels (Linux + Windows, including FlashAttention 2.7.4.post1 with PyTorch 2.5–2.9 and CUDA 12.4–13.0). (GitHub)
  - flash_attn_windows by petermg on GitHub (wheels built for Windows, advertised on Reddit). (Reddit)
So building from source on Windows is possible, but often not worth the headache compared to installing a matching wheel.
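For reference, a typical from-source attempt on Windows looks roughly like this. This is a sketch only, assuming MSVC Build Tools, the CUDA 12.4 toolkit, and Ninja are already installed, and that the commands run in an "x64 Native Tools" prompt where conda is on PATH:

rem rough sketch of a Windows source build; adapt to your setup
conda activate longcat-video
pip install ninja packaging psutil
rem limit parallel compile jobs so nvcc does not exhaust RAM
set MAX_JOBS=4
pip install flash_attn==2.7.4.post1 --no-build-isolation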
4. LongCat-Video’s specific requirements
The official LongCat-Video README says to do exactly what you did: (GitHub)
# create conda environment
conda create -n longcat-video python=3.10
conda activate longcat-video
# install torch (configure according to your CUDA version)
pip install torch==2.6.0+cu124 torchvision==0.21.0+cu124 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
# install flash-attn-2
pip install ninja
pip install psutil
pip install packaging
pip install flash_attn==2.7.4.post1
# install other requirements
pip install -r requirements.txt
And then notes:
FlashAttention-2 is enabled in the model config by default; you can also change the model config (./weights/LongCat-Video/dit/config.json) to use FlashAttention-3 or xformers once installed. (GitHub)
So:
- Your versions (Python 3.10 + torch 2.6.0+cu124 + flash_attn 2.7.4.post1) are exactly what LongCat expects.
- The only reason you fail is the packaging / platform story around flash_attn itself.
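A quick way to confirm your env really does match that stack, using only standard Torch attributes:

python -c "import sys, torch; print(sys.version.split()[0], torch.__version__, torch.version.cuda, torch.cuda.is_available())"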
5. Concrete causes in your case
Let’s spell them out clearly and tie them directly to your logs.
Cause A: pip build isolation hides Torch from FlashAttention
- Torch is installed in C:\Users\Owner\anaconda3\envs\longcat-video\lib\site-packages.
- Pip's build env lives under C:\Users\Owner\AppData\Local\Temp\pip-build-env-....
- FlashAttention's setup is executed inside this temp env when pip runs get_requires_for_build_wheel. It tries import torch and fails because Torch is not installed there.
- Result: ModuleNotFoundError: No module named 'torch' and Failed to build 'flash_attn'.
Exactly this situation is documented in:
- FlashAttention's PyPI page, with the warning and the recommended --no-build-isolation fix. (PyPI)
- GitHub issues #309, #1920, etc., where users see identical logs and are told to install Torch first and then use pip install flash-attn --no-build-isolation. (GitHub)
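Before trying any fix, you can confirm that Torch really is visible in the activated env (and where it lives):

conda activate longcat-video
python -c "import torch; print(torch.__version__, torch.__file__)"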
Cause B: there is no ready-made flash_attn wheel for your Windows platform
- On Windows, pip often cannot find a prebuilt wheel on PyPI for your exact combination (PyTorch 2.6.0 + CUDA 12.4 + Python 3.10 + flash_attn 2.7.4.post1), so it falls back to building from source.
- Building from source on Windows:
  - needs MSVC, the CUDA toolkit, Ninja, and the right environment variables,
  - is known to be error-prone and slow. (GitHub)
So even after you fix build isolation, you may still hit C++/CUDA build errors unless you either:
- install a matching prebuilt wheel, or
- move to Linux/WSL2 and follow the official instructions there, or
- skip FlashAttention entirely and change LongCat’s attention backend.
6. Solutions you can realistically choose from
There is no single “magic” fix; there are a few solid paths. I’ll list them from “least invasive” to “most robust”.
Option 1 — Try the official fix: build in your real env (no isolation)
This directly addresses the ModuleNotFoundError: No module named 'torch'.
- Activate your env:
  conda activate longcat-video
- Make sure basic build tools are up to date:
  python -m pip install --upgrade pip setuptools wheel
- Install FlashAttention without build isolation:
  python -m pip install --no-build-isolation "flash-attn==2.7.4.post1"
  or (same thing, different spelling):
  python -m pip install --no-build-isolation "flash_attn==2.7.4.post1"
This is exactly what the FlashAttention docs and multiple answers recommend when you see the “No module named ‘torch’” error. (PyPI)
- If that succeeds, then:
  python -m pip install -r requirements.txt
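If the build finishes, a quick import check confirms the extension actually loads (flash_attn_func is flash-attn's functional interface):

python -c "from flash_attn import flash_attn_func; print('flash_attn imports OK')"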
What can still go wrong:
- On Windows, this may now progress past the “torch missing” phase and then fail with MSVC/NVCC errors, RAM issues, etc. Guides warn that building from source can take 1–3+ hours and is sensitive to toolchain setup. (Reddit)
If your goal is to experiment quickly and not debug compilers, you may prefer Options 2 or 3 instead.
Option 2 — Install a prebuilt FlashAttention wheel for your exact stack
This avoids compiling on Windows entirely.
There is a dedicated repo mjun0812/flash-attention-prebuild-wheels that publishes prebuilt wheels for Linux and Windows. The latest release (v0.4.19) explicitly lists Windows wheels for:
- FlashAttention: 2.7.4.post1, 2.8.3
- Python: 3.10, 3.11, 3.12, 3.13
- PyTorch: 2.5, 2.6, 2.7, 2.8, 2.9
- CUDA: 12.4, 12.6, 13.0 (GitHub)
That perfectly matches:
- Python 3.10
- Torch 2.6.0+cu124
- CUDA 12.4
- FlashAttention 2.7.4.post1
Typical procedure:
- Stay with your current env:
  conda activate longcat-video
- Visit the v0.4.19 release page of mjun0812/flash-attention-prebuild-wheels. (GitHub)
- Download the .whl whose filename encodes:
  - flash_attn-2.7.4.post1
  - cp310 (Python 3.10)
  - torch2.6 (or similar)
  - cu124
  - win_amd64
- Install it directly:
  python -m pip install "C:\path\to\flash_attn-2.7.4.post1-...-cp310-cp310-win_amd64.whl"
- Verify:
  python -c "import flash_attn; print('flash_attn OK')"
- Then:
  python -m pip install -r requirements.txt
This way, pip sees that flash-attn==2.7.4.post1 is already installed and does not try to build it again.
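If you want a slightly stronger check than the import test, you can run a single attention call on the GPU. This is a minimal sketch, assuming flash_attn_func's usual (batch, seqlen, nheads, headdim) fp16 layout; skip it if your GPU is not supported by FlashAttention-2:

# flash_smoke_test.py (hypothetical file name); run with: python flash_smoke_test.py
import torch
from flash_attn import flash_attn_func  # flash-attn's functional interface

# tiny fp16 tensors on the GPU, laid out as (batch, seqlen, nheads, headdim)
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

out = flash_attn_func(q, k, v, causal=True)
print("flash_attn kernel ran, output shape:", tuple(out.shape))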
Alternatives: the petermg flash_attn_windows wheels mentioned earlier are another source, if they happen to cover your exact Python/Torch/CUDA combination. (Reddit)
Tradeoffs:
- You must ensure the wheel’s Python, Torch, and CUDA versions match your environment exactly. If they don’t, you’ll get import or runtime errors.
- You’re relying on community-built wheels rather than official PyPI wheels, but this is a common route people take to get FlashAttention working on Windows. (GitHub)
Option 3 — Skip FlashAttention entirely and use a different backend
LongCat-Video’s README explicitly says: (GitHub)
FlashAttention-2 is enabled in the model config by default; you can also change the model config (./weights/LongCat-Video/dit/config.json) to use FlashAttention-3 or xformers once installed.
And in practice you can also fall back to PyTorch's native scaled-dot-product attention (SDPA) if you are okay with slower generation (a sketch of that fallback appears at the end of this option).
Steps:
- Edit requirements.txt in the LongCat repo and comment out the FlashAttention line:
  # flash-attn==2.7.4.post1
- Install everything else:
  conda activate longcat-video
  python -m pip install -r requirements.txt
- After you download the model weights (via huggingface-cli as in the README), open:
  ./weights/LongCat-Video/dit/config.json
- Look for the fields controlling the attention backend (naming may vary, e.g. attn_backend, use_flash_attn) and change them so they do not expect FlashAttention:
  - If there is a "backend" or "attn_backend" key set to "flash_attn_2", change it to something like "native" or "pytorch" (or "xformers" if you install xFormers and it has Windows wheels for your Torch/CUDA).
  - Some configs may have "use_flash_attn": true; change it to false.
  The exact key names depend on their implementation, but the README confirms that the backend is configurable and that FlashAttention-2 is just the default. (GitHub)
- Run the demos as usual:
  torchrun run_demo_text_to_video.py --checkpoint_dir=./weights/LongCat-Video --enable_compile
Downsides:
- Generation will be slower, especially for long high-res videos, because you’re not using fused attention kernels.
- But for testing, experimentation, and medium-length videos, this is usually perfectly acceptable and significantly easier than wrestling with FlashAttention on Windows.
This is often the most pragmatic tradeoff: you get LongCat working now, and you can always revisit FlashAttention later.
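For reference, the fallback in question is just PyTorch's built-in kernel. Here is a minimal illustration of the call that replaces flash_attn_func when FlashAttention is absent (illustrative only, not LongCat's actual code; note that SDPA expects a (batch, nheads, seqlen, headdim) layout):

# illustrative only: PyTorch-native attention, no FlashAttention required
import torch
import torch.nn.functional as F

batch, nheads, seqlen, headdim = 1, 8, 128, 64
q = torch.randn(batch, nheads, seqlen, headdim, device="cuda", dtype=torch.float16)
k = torch.randn(batch, nheads, seqlen, headdim, device="cuda", dtype=torch.float16)
v = torch.randn(batch, nheads, seqlen, headdim, device="cuda", dtype=torch.float16)

# PyTorch dispatches to its own flash / memory-efficient / math kernels as available
out = F.scaled_dot_product_attention(q, k, v, is_causal=False)
print(tuple(out.shape))  # (1, 8, 128, 64)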
Option 4 — Use Linux or WSL2 and follow the README literally
LongCat-Video and FlashAttention are primarily tuned for Linux + CUDA. The README you’re following assumes that environment. (GitHub)
A very robust approach is:
- Use a Linux environment (bare metal or WSL2 with GPU passthrough).
- Install CUDA 12.4 and the exact LongCat stack:
  - Python 3.10
  - torch 2.6.0+cu124, torchvision, torchaudio
  - pip install flash-attn==2.7.4.post1 --no-build-isolation
  - pip install -r requirements.txt
- Optionally, use a preconfigured guide such as the DigitalOcean tutorial, which walks through setting up LongCat-Video on their GPU droplets with the right drivers/tooling. (DigitalOcean)
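If you go the WSL2 route, it is worth confirming the GPU is actually visible inside the Linux environment before installing the heavy pieces (standard commands, nothing LongCat-specific; the second one assumes Torch is already installed):

# inside the WSL2 / Linux shell
nvidia-smi
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"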
Pros:
- You’re running in the environment the authors and FlashAttention maintainers actually test.
- Fewer platform-specific surprises.
Cons:
- Requires either a dual-boot/WSL2 setup or a remote Linux machine.
7. Minor side notes about your environment
Not blockers now, but worth being aware of:
- You installed:
  - psutil-7.1.3 manually, while requirements.txt wants psutil==6.0.0,
  - numpy-2.1.2, while requirements.txt later pulls numpy==1.26.4.
Pip will try to adjust these during pip install -r requirements.txt. This is usually fine, but mixing many manual installs with a strict requirements.txt can sometimes cause confusing version downgrades or binary-ABI mismatches (especially for numpy). Keeping a clean environment that only follows the README plus any deliberate change (like removing FlashAttention) reduces noise when you debug.
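Once pip install -r requirements.txt has finished, pip's built-in consistency check is a cheap way to spot any leftover version conflicts:

python -m pip check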
8. Short recap (causes and solutions)
Causes
- The ModuleNotFoundError: No module named 'torch' during the flash_attn install is caused by pip's build isolation: FlashAttention's build runs inside a temporary env that doesn't have Torch, so import torch fails, even though Torch is installed in your Conda env. (PyPI)
- On Windows, pip often needs to compile FlashAttention from source because there’s no official wheel for your exact Python/Torch/CUDA combination on PyPI, and that compilation is complex and fragile. (GitHub)
Solutions
- Option 1: Try the official fix: python -m pip install --no-build-isolation flash-attn==2.7.4.post1 in your longcat-video env (then pip install -r requirements.txt). This fixes the "torch not found" piece but may still hit Windows compilation issues. (PyPI)
- Option 2: Use a prebuilt Windows wheel for your exact stack from mjun0812/flash-attention-prebuild-wheels (FlashAttention 2.7.4.post1, Python 3.10, Torch 2.6, CUDA 12.4), install it via pip install path\to\wheel.whl, then run pip install -r requirements.txt. (GitHub)
- Option 3: Skip FlashAttention entirely:
  - remove flash-attn from requirements.txt,
  - change ./weights/LongCat-Video/dit/config.json to use a non-FlashAttention backend (native or xFormers),
  - accept slower but much simpler installs on Windows. (GitHub)
- Option 4: Run LongCat-Video in a Linux/WSL2 environment (local or cloud) where the exact torch + FlashAttention combo is officially tested, following the README as-is. (GitHub)
Option 4: Run LongCat-Video in a Linux/WSL2 environment (local or cloud) where the exact torch + FlashAttention combo is officially tested, following the README as-is. (GitHub)