Basically, you can’t install flash-attention from a plain requirements.txt entry because its source build needs torch to already be installed, and at that point pip hasn’t set torch up yet.
Installing it via subprocess doesn’t work either, and I haven’t been able to get it working through the kernels library.
This approach (installing a prebuilt wheel from requirements.txt) is the right idea, but it fails here because the wheel doesn’t match the environment (as of this writing, Spaces defaults to Python 3.10 and CUDA 12.3).
The newest prebuilt wheel that satisfies these constraints is `https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl`, and you will need to pin torch==2.4.1 in requirements.txt.
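For reference, a minimal requirements.txt sketch of what that looks like (the wheel URL is the one above; the torch pin is based on the wheel being built against torch 2.4, so adjust both to match what your Space actually reports):

```
# Pin torch to the version the prebuilt flash-attn wheel was compiled against (2.4.x)
torch==2.4.1

# Prebuilt flash-attn wheel for Python 3.10 / CUDA 12.x / torch 2.4.
# Because this is a binary wheel, pip never runs flash-attn's source build,
# so the "torch not found" failure from building from source doesn't apply.
flash-attn @ https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.4cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
```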
The wheel version may well be WRONG by the time you read this: check the Python and CUDA versions in the build logs and find the matching wheel. If you can’t or don’t want to do that, prebuild the wheel locally (ideally in a separate conda environment) and upload it to your Spaces repo for installation.
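If you do go the local-build route, here is a rough sketch (the env name and the Python/torch versions are placeholders; match them to your Space’s build logs, and you also need a local CUDA toolkit of a compatible version):

```bash
# Throwaway conda env mirroring the Space (versions here are examples, not guarantees)
conda create -n flash-attn-build python=3.10 -y
conda activate flash-attn-build

# Install the same torch you will pin in the Space's requirements.txt
pip install torch==2.4.1

# Build the wheel; --no-build-isolation lets flash-attn's setup see the installed torch
pip wheel flash-attn --no-build-isolation --no-deps -w dist/
```

Commit the resulting `dist/*.whl` to the Space repo and reference it from requirements.txt as a relative path (e.g. `./flash_attn-<version>-cp310-cp310-linux_x86_64.whl`); that should resolve fine since the build runs from the repo root.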
I genuinely hope this is not necessary