Any idea why flash-attn installation with an AMD GPU results in metadata-generation-failed?

I’m trying to run my fine-tuned model on Setonix (a supercomputer with AMD MI250 GPUs), but I always end up with a "metadata generation failed" error when installing the flash-attn package.


In an environment without CUDA I do this:

import os
import subprocess

# Merge os.environ in, so pip is still on PATH; passing only the flag as
# env would wipe the rest of the environment. The variable tells
# flash-attn's setup.py to skip the CUDA build.
subprocess.run(
    'pip install flash-attn --no-build-isolation',
    env={**os.environ, 'FLASH_ATTENTION_SKIP_CUDA_BUILD': 'TRUE'},
    shell=True,
)
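If you don't need to drive the install from Python, the same thing can be done directly in a POSIX shell by setting the variable inline for the one command (a sketch of the equivalent invocation, not a guaranteed fix for the AMD build):

```shell
# FLASH_ATTENTION_SKIP_CUDA_BUILD tells flash-attn's setup.py to skip
# compiling the CUDA kernels during metadata generation / build.
FLASH_ATTENTION_SKIP_CUDA_BUILD=TRUE pip install flash-attn --no-build-isolation
```

The `VAR=value command` form sets the variable only for that single pip invocation, so it doesn't leak into the rest of your session.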