I am looking to overwrite the attention heads in the Bart model, following the below process:
- Run the model on an article with the keyword parameter: “Covid”
- Save the encoder/decoder heads for this article
- Run the model on another article, also with the keyword parameter: “Covid”
- As a proxy for making the model ‘topic-aware’, replace the attention heads produced by the run in step 3 with the “Covid” attention heads saved in step 2
- The model will then generate a new ‘topic-aware’ summary for the article, since the inserted attention heads were produced on the topic keyword “Covid”
Note: The above is extremely preliminary; we will be looking to train the attention heads and model on more data for each keyword in the future.
```python
from transformers import AutoTokenizer, BartForConditionalGeneration

article = """Covid-19 is a global pandemic"""
model_name = "facebook/bart-large-cnn"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# A model with the LM head is needed to generate summaries;
# plain AutoModel returns hidden states only.
model = BartForConditionalGeneration.from_pretrained(model_name)
model.config.output_attentions = True  # note the plural: output_attentions

inputs = tokenizer(article, padding=True, truncation=True, return_tensors="pt")

# The forward pass exposes the attention probabilities for every layer/head.
outputs = model(**inputs, output_attentions=True, output_hidden_states=True)
covid_encoder_attention = outputs.encoder_attentions  # tuple: one tensor per layer
covid_decoder_attention = outputs.decoder_attentions

# Generating the summary is a separate call; decode the token ids it returns.
summary_ids = model.generate(inputs["input_ids"], max_length=60)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

# Repeat model run with new article and insert covid_encoder_attention
# and/or covid_decoder_attention for the new run
```
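For the "insert the saved attention heads" step, BART exposes no public hook for overriding attention probabilities, so in the real model this would mean patching `BartAttention.forward`. The toy sketch below shows the arithmetic of the idea on a plain `torch.nn.MultiheadAttention` layer (all inputs are random stand-ins, not real articles): save the per-head attention probabilities from one input, then combine them with the value vectors of a *different* input:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embed_dim, num_heads, seq_len = 8, 2, 4
head_dim = embed_dim // num_heads
attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

x_covid = torch.randn(1, seq_len, embed_dim)  # stand-in for the "Covid" article
x_new = torch.randn(1, seq_len, embed_dim)    # stand-in for the second article

with torch.no_grad():
    # Step 2: save the per-head attention probabilities from the first run.
    _, saved = attn(x_covid, x_covid, x_covid, average_attn_weights=False)
    # saved: (batch, num_heads, tgt_len, src_len)

    # Step 4: recompute the layer output for the new article, but weight the
    # new article's value vectors with the *saved* attention probabilities.
    w_v = attn.in_proj_weight[2 * embed_dim:]  # value projection slice
    b_v = attn.in_proj_bias[2 * embed_dim:]
    v = (x_new @ w_v.T + b_v).view(1, seq_len, num_heads, head_dim).transpose(1, 2)
    ctx = (saved @ v).transpose(1, 2).reshape(1, seq_len, embed_dim)
    out = ctx @ attn.out_proj.weight.T + attn.out_proj.bias

print(out.shape)  # torch.Size([1, 4, 8])
```

One caveat worth noting: the saved tensors have shape `(batch, heads, tgt_len, src_len)`, so they only broadcast cleanly when the second article tokenizes to the same sequence length, which is another reason a learned, length-independent representation may be needed later.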