Hi all,
Is there currently a way to extract the attention attribute from a model such as GPT-2 and swap it with Flash-Attention?
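For context, here is a minimal sketch of the kind of in-place swap I have in mind. The class names below are stand-ins, not the real modules (in the Hugging Face transformers GPT-2 implementation, the blocks live at `model.transformer.h` and each block's attention at `block.attn`):

```python
# Hypothetical stand-ins to illustrate the swap; the real modules would be
# the stock GPT2Attention and a Flash-Attention-backed replacement.

class Attention:           # stand-in for the stock attention module
    def __call__(self, x):
        return x

class FlashAttention:      # stand-in for a Flash-Attention replacement
    def __call__(self, x):
        return x

class Block:
    def __init__(self):
        self.attn = Attention()

class Model:
    def __init__(self, n_layers=2):
        self.blocks = [Block() for _ in range(n_layers)]

model = Model()
for block in model.blocks:
    block.attn = FlashAttention()   # replace the attention attribute per block
```

Is reassigning the attribute like this the intended approach, or is there a supported way to do it through the library?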
Thank you,
Enrico