Access and modify attention weights at runtime

  1. Is it possible to access/extract the attention weights for a given forward pass?

This seems to be possible using the output_attentions=True argument, like so:

from transformers import BertTokenizer, BertModel

model = BertModel.from_pretrained('bert-base-uncased', output_attentions=True)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)
inputs = tokenizer.encode_plus("The cat sat on the mat", "The cat lay on the rug", return_tensors='pt', add_special_tokens=True)
input_ids = inputs['input_ids']
token_type_ids = inputs['token_type_ids']
attention = model(input_ids, token_type_ids=token_type_ids)[-1]  # last element is the tuple of per-layer attentions
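If I'm reading the docs correctly, `attention` should then be a tuple with one tensor per layer (12 for bert-base-uncased), each of shape (batch_size, num_heads, seq_len, seq_len):

```python
print(len(attention))      # 12 attention maps, one per layer
print(attention[0].shape)  # torch.Size([1, 12, seq_len, seq_len])
```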
  2. Relatedly, is it possible to modify the attention weights at runtime?

I have found no method for doing this in transformers. As far as I can tell, the package isn't meant for this kind of intervention, but rather for fine-tuning models in a particular way (e.g. adding or replacing the final output layers and then training them on a specific task), which is not what I want.
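The only workaround I can think of is reaching into the model with plain PyTorch forward hooks. This leans on undocumented transformers internals (in the current BERT implementation the attention probabilities pass through a dropout submodule, encoder.layer[i].attention.self.dropout, before being applied to the values), so this is a rough sketch rather than a supported approach:

```python
from transformers import BertModel, BertTokenizer

model = BertModel.from_pretrained('bert-base-uncased', output_attentions=True)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)

def zero_out_first_head(module, inputs, output):
    # output holds this layer's attention probabilities,
    # shape (batch_size, num_heads, seq_len, seq_len); returning a tensor
    # from a forward hook replaces the module's output downstream
    modified = output.clone()
    modified[:, 0, :, :] = 0.0  # e.g. silence head 0 of this layer
    return modified

# attach the hook to layer 0's attention dropout (internal module path, may change between versions)
handle = model.encoder.layer[0].attention.self.dropout.register_forward_hook(zero_out_first_head)

inputs = tokenizer.encode_plus("The cat sat on the mat", return_tensors='pt')
outputs = model(**inputs)  # this forward pass now uses the modified weights in layer 0

handle.remove()  # detach the hook to restore normal behaviour
```

Since this depends on internal module names it could break with any library update, and it only changes the weights used to compute that layer's context vectors, so I'm not sure it counts as a proper solution.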
