How to install flash-attention on an HF Gradio Space

I tried putting flash-attn in the requirements.txt file to install flash-attention on my Space, but it gives an error that torch is not installed.

I also tried listing torch above flash-attn, but it still fails, probably because torch is not yet installed when flash-attn's build step runs.

Please help!

hi @nxphi47,

One option is to use a custom Dockerfile and install flash-attn as a build step, for example:
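
A rough sketch of such a Dockerfile (the base image and the torch/CUDA versions here are assumptions; adjust them to your Space's hardware, and note that compiling flash-attn from source can take a long time):

# Hypothetical Dockerfile sketch: install torch first so flash-attn's build can find it
FROM nvidia/cuda:11.8.0-devel-ubuntu22.04
RUN apt-get update && apt-get install -y python3 python3-pip git
RUN pip3 install packaging ninja
RUN pip3 install torch --index-url https://download.pytorch.org/whl/cu118
RUN pip3 install flash-attn --no-build-isolation
WORKDIR /app
COPY . .
CMD ["python3", "app.py"]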

Another option, if you're using the Gradio/Streamlit SDK, is to install it at runtime from your app code:

import os, subprocess
# Merge os.environ in so PATH stays visible to pip; FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE skips compiling the CUDA kernels from source.
subprocess.run('pip install flash-attn --no-build-isolation', env={**os.environ, 'FLASH_ATTENTION_SKIP_CUDA_BUILD': "TRUE"}, shell=True)
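
Run this near the top of app.py, before importing anything that needs flash_attn, since the package has to be installed before the import happens.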

Put this prebuilt wheel inside your requirements.txt; because it is already compiled, pip doesn't need torch installed to build it:

https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.9.post1/flash_attn-2.5.9.post1+cu118torch1.12cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
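
In context, a requirements.txt using it would look roughly like this (a sketch; the +cu118, torch and cp310 tags in the filename must match the Space's CUDA, torch, and Python versions, so pick the release asset that fits your setup):

# the wheel tags must match the Space's CUDA, torch and Python versions
torch
https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.9.post1/flash_attn-2.5.9.post1+cu118torch1.12cxx11abiFALSE-cp310-cp310-linux_x86_64.whl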
