How to access raw attention logits?

I would like to get the attention logits (i.e. the values of QK^T) for all layers of a Transformer model, in other words, the pre-softmax values. Is there a simple way to get them using the API?

Hi,

The Transformers library supports passing output_attentions=True to the forward of any model. However, these are the post-softmax attention probabilities. To get the pre-softmax values, one would need to fork the library and tweak the model outputs, or use PyTorch hooks.
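As a rough illustration of the hook-based route, here is a minimal sketch that assumes a BERT-style model ("bert-base-uncased"), where the query/key projections live at `encoder.layer[i].attention.self.query` and `.key`. It registers forward hooks on those Linear modules and recomputes QK^T / sqrt(d) per head; for other architectures the module paths will differ, and note that BERT also adds the attention mask to these scores before the softmax, which this sketch leaves out.

```python
import math
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

num_heads = model.config.num_attention_heads
head_dim = model.config.hidden_size // num_heads

# Storage for the projected query/key tensors of every layer.
captured = {}

def make_hook(layer_idx, which):
    def hook(module, inputs, output):
        captured[(layer_idx, which)] = output.detach()
    return hook

# Register forward hooks on the query/key projections of each layer.
handles = []
for i, layer in enumerate(model.encoder.layer):
    handles.append(layer.attention.self.query.register_forward_hook(make_hook(i, "q")))
    handles.append(layer.attention.self.key.register_forward_hook(make_hook(i, "k")))

inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

def split_heads(x):
    # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
    b, s, _ = x.shape
    return x.view(b, s, num_heads, head_dim).permute(0, 2, 1, 3)

# Recompute the pre-softmax scores QK^T / sqrt(d) for each layer.
# (The model additionally adds the extended attention mask before
# the softmax; that step is omitted here.)
attention_logits = []
for i in range(model.config.num_hidden_layers):
    q = split_heads(captured[(i, "q")])
    k = split_heads(captured[(i, "k")])
    scores = q @ k.transpose(-1, -2) / math.sqrt(head_dim)
    attention_logits.append(scores)  # (batch, heads, seq_len, seq_len)

for h in handles:
    h.remove()

print(attention_logits[0].shape)
```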

Thanks, that’s the conclusion I’ve reached :frowning:
Is there a tutorial about using hooks I can learn from?