How to access raw attention logits?

I would like to get the attention logits (i.e. the values of QK^T) for all layers of a Transformer model, in other words, the pre-softmax values. Is there a simple way to get them using the API?

Hi,

The Transformers library supports passing output_attentions=True to the forward of any model. However, these are the post-softmax attention probabilities. To get the pre-softmax values, one would need to fork the library and tweak the model outputs, or use PyTorch hooks.
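As a rough illustration of the hook-based route, here is a minimal sketch that assumes a BERT-style model ("bert-base-uncased"), where the query/key projections live at `encoder.layer[i].attention.self.query` and `.key`. It registers forward hooks on those Linear modules and recomputes QK^T / sqrt(d) per head; for other architectures the module paths will differ, and note that BERT also adds the attention mask to these scores before the softmax, which this sketch leaves out.

```python
import math
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

num_heads = model.config.num_attention_heads
head_dim = model.config.hidden_size // num_heads

# Storage for the projected query/key tensors of every layer.
captured = {}

def make_hook(layer_idx, which):
    def hook(module, inputs, output):
        captured[(layer_idx, which)] = output.detach()
    return hook

# Register forward hooks on the query/key projections of each layer.
handles = []
for i, layer in enumerate(model.encoder.layer):
    handles.append(layer.attention.self.query.register_forward_hook(make_hook(i, "q")))
    handles.append(layer.attention.self.key.register_forward_hook(make_hook(i, "k")))

inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

def split_heads(x):
    # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
    b, s, _ = x.shape
    return x.view(b, s, num_heads, head_dim).permute(0, 2, 1, 3)

# Recompute the pre-softmax scores QK^T / sqrt(d) for each layer.
# (The model additionally adds the extended attention mask before
# the softmax; that step is omitted here.)
attention_logits = []
for i in range(model.config.num_hidden_layers):
    q = split_heads(captured[(i, "q")])
    k = split_heads(captured[(i, "k")])
    scores = q @ k.transpose(-1, -2) / math.sqrt(head_dim)
    attention_logits.append(scores)  # (batch, heads, seq_len, seq_len)

for h in handles:
    h.remove()

print(attention_logits[0].shape)
```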

Thanks, that’s the conclusion I’ve reached :frowning:
Is there a tutorial about using hooks I can learn from?