Phi3 Mini 4k Instruct Flash Attention not found

When loading `microsoft/Phi-3-mini-4k-instruct` with `transformers.pipeline` and `load_in_4bit=True`, I get warnings stating:

WARNING:transformers_modules.microsoft.Phi-3-mini-4k-instruct.920b6cf52a79ecff578cc33f61922b23cbc88115.modeling_phi3:`flash-attention` package not found, consider installing for better performance: No module named 'flash_attn'.

WARNING:transformers_modules.microsoft.Phi-3-mini-4k-instruct.920b6cf52a79ecff578cc33f61922b23cbc88115.modeling_phi3:Current `flash-attenton` does not support `window_size`. Either upgrade or use `attn_implementation='eager'`.

Perhaps I have loaded it incorrectly? I do have `accelerate` in my environment.
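
For context, a minimal sketch of the kind of call that triggers the warnings, assuming the 4-bit flag is passed through `model_kwargs` (the exact arguments are my setup, not anything prescribed by the model card):

```python
from transformers import pipeline

# Sketch of the load that produces the warnings. trust_remote_code pulls in
# Phi-3's custom modeling_phi3.py, which tries to import flash_attn and
# falls back (with the warning above) when the package is missing.
pipe = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",
    trust_remote_code=True,
    model_kwargs={"load_in_4bit": True},  # requires bitsandbytes + accelerate
)
```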

+1, I have the same problem.

`flash-attention` package not found, consider installing for better performance: No module named 'flash_attn'.
Current `flash-attenton` does not support `window_size`. Either upgrade or use `attn_implementation='eager'`.

I see the same warnings with `microsoft/Phi-3-mini-128k-instruct`.


The solution for me was to install the missing package.

I thought flash attention was handled by accelerate, but I think Phi-3's custom modeling code implements its own version using a specific package (`flash_attn`).

You can just do `pip install flash-attn`.
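
Once it is installed, you can also request the attention backend explicitly; a minimal sketch, assuming a CUDA GPU and bf16 weights (flash attention only runs in fp16/bf16 on GPU), with `attn_implementation="eager"` as the no-dependency alternative the warning mentions:

```python
import torch
from transformers import AutoModelForCausalLM

# Sketch: after `pip install flash-attn`, ask for the flash-attention
# kernels explicitly. Swap in attn_implementation="eager" to run without
# the flash_attn package at all.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,   # flash attention needs fp16/bf16
    device_map="auto",
    attn_implementation="flash_attention_2",
)
```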


Collecting flash-attn
  Using cached flash_attn-2.5.8.tar.gz (2.5 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
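
A common cause of this particular failure is pip's build isolation hiding the already-installed `torch` from flash-attn's setup script; the flash-attn README recommends `pip install flash-attn --no-build-isolation` (with `torch`, `packaging` and `ninja` installed first). If the wheel still won't build, the package isn't strictly required. Here is a sketch of falling back to the eager path the warning itself suggests, only enabling flash attention when `flash_attn` actually imports (the selection logic is my own, not part of transformers):

```python
import importlib.util
from transformers import AutoModelForCausalLM

# Pick the attention backend at runtime: flash attention only if the
# flash_attn package is importable, otherwise the eager implementation
# suggested by the warning.
attn_impl = "flash_attention_2" if importlib.util.find_spec("flash_attn") else "eager"

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    trust_remote_code=True,
    torch_dtype="auto",  # checkpoint ships bf16 weights; flash attention needs fp16/bf16
    attn_implementation=attn_impl,
)
```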
