Access and modify attention weights at runtime

  1. Is it possible to access/extract the attention weights for a given forward pass?

This seems to be possible using the output_attentions=True argument, like so:

from transformers import BertTokenizer, BertModel

model = BertModel.from_pretrained('bert-base-uncased', output_attentions=True)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)
inputs = tokenizer.encode_plus("The cat sat on the mat", "The cat lay on the rug", return_tensors='pt', add_special_tokens=True)
input_ids = inputs['input_ids']
token_type_ids = inputs['token_type_ids']
attention = model(input_ids, token_type_ids=token_type_ids)[-1]  # last element is the tuple of per-layer attentions
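If I'm reading the docs correctly, `attention` should then be a tuple with one tensor per layer (12 for bert-base-uncased), each of shape (batch_size, num_heads, seq_len, seq_len):

```python
print(len(attention))      # 12 attention maps, one per layer
print(attention[0].shape)  # torch.Size([1, 12, seq_len, seq_len])
```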
  2. Relatedly, is it possible to modify the attention weights at runtime?

I have found no method for doing this in transformers. As far as I can tell, the package isn't meant for this kind of intervention, but rather for fine-tuning models in a particular way (e.g. adding or replacing the final output layers and then training them on a specific task), which is not what I want.
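The only workaround I can think of is reaching into the model with plain PyTorch forward hooks. This leans on undocumented transformers internals (in the current BERT implementation the attention probabilities pass through a dropout submodule, encoder.layer[i].attention.self.dropout, before being applied to the values), so this is a rough sketch rather than a supported approach:

```python
from transformers import BertModel, BertTokenizer

model = BertModel.from_pretrained('bert-base-uncased', output_attentions=True)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)

def zero_out_first_head(module, inputs, output):
    # output holds this layer's attention probabilities,
    # shape (batch_size, num_heads, seq_len, seq_len); returning a tensor
    # from a forward hook replaces the module's output downstream
    modified = output.clone()
    modified[:, 0, :, :] = 0.0  # e.g. silence head 0 of this layer
    return modified

# attach the hook to layer 0's attention dropout (internal module path, may change between versions)
handle = model.encoder.layer[0].attention.self.dropout.register_forward_hook(zero_out_first_head)

inputs = tokenizer.encode_plus("The cat sat on the mat", return_tensors='pt')
outputs = model(**inputs)  # this forward pass now uses the modified weights in layer 0

handle.remove()  # detach the hook to restore normal behaviour
```

Since this depends on internal module names it could break with any library update, and it only changes the weights used to compute that layer's context vectors, so I'm not sure it counts as a proper solution.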
