Left-to-right (left-context / causal) attention mask generation with BertGeneration and RobertaForCausalLM
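
For reference, here is a minimal sketch of what I mean by a left-to-right mask, assuming `roberta-base` weights. My understanding is that `RobertaForCausalLM` only applies a causal mask when `config.is_decoder=True` (the lower-triangular construction below is just my illustration of the mask shape, not necessarily how the library builds it internally):

```python
import torch
from transformers import RobertaConfig, RobertaForCausalLM, RobertaTokenizer

# A left-to-right (causal) mask: position i may only attend to positions <= i.
seq_len = 5
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.long))
print(causal_mask)

# As far as I can tell, RobertaForCausalLM behaves like plain bidirectional
# RoBERTa unless the config marks it as a decoder.
config = RobertaConfig.from_pretrained("roberta-base")
config.is_decoder = True
model = RobertaForCausalLM.from_pretrained("roberta-base", config=config)

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
inputs = tokenizer("Hello world", return_tensors="pt")

# Passing labels computes the causal LM loss (next-token prediction).
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)
```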

Oh, and it seems that @patrickvonplaten implemented or is involved with these models; maybe you could point me to where the mask is actually created? That would be very helpful :pray: Thanks in advance.