Where does the causal mask get generated for the T5 decoder?

I am trying to understand how the causal mask is implemented for the T5 encoder-decoder model. The docstring for decoder_attention_mask says “Causal mask will also be used by default”. However, I cannot find any code in modeling_t5.py that generates the causal mask. Can somebody point me to the place where the causal mask is generated for the T5 decoder?

Found it. The causal mask is created in get_extended_attention_mask, which is defined in src/transformers/modeling_utils.py on the main branch of the huggingface/transformers repository on GitHub, not in modeling_t5.py.
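
For anyone landing here later, this is a rough sketch of the idea behind get_extended_attention_mask, not the library's actual code. As far as I can tell, the real method works on tensors, applies the causal part only when config.is_decoder is set, and adds a broadcast dimension for the attention heads. The helper name causal_extended_mask and the masked_value default below are my own assumptions for illustration; it uses plain Python lists so it runs without any dependencies.

```python
def causal_extended_mask(attention_mask, masked_value=-1e9):
    """Hypothetical helper (not part of transformers).

    attention_mask: one row per batch item, 1 for real tokens, 0 for padding.
    Returns an additive mask indexed as [batch][query_pos][key_pos]:
    0.0 where attention is allowed, masked_value where it is blocked.
    """
    extended = []
    for row in attention_mask:
        seq_len = len(row)
        extended.append([
            [
                # A query at position q may attend to key k only if k is
                # not in the future (k <= q) and k is not a padding token.
                0.0 if (k <= q and row[k] == 1) else masked_value
                for k in range(seq_len)
            ]
            for q in range(seq_len)
        ])
    return extended


# One sequence of length 4 whose last position is padding.
mask = causal_extended_mask([[1, 1, 1, 0]])
for query_row in mask[0]:
    print(["keep" if value == 0.0 else "block" for value in query_row])
```

The result is added to the attention scores before the softmax, so blocked positions end up with a weight of roughly zero. That is why passing only a padding-style decoder_attention_mask is enough: the causal part is combined with it inside the shared utility rather than in the T5 model file.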
