About the Cross-attention Layer Shape in Encoder-Decoder Model

I found the code in the EncoderDecoderModel class that maps the encoder hidden state size to the decoder hidden state size (see the link here). That solved my problem.
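
For anyone who lands here with the same question, below is a minimal sketch of the idea rather than the exact transformers source: when the encoder and decoder hidden sizes differ, a linear projection maps the encoder output to the decoder's hidden size before it is fed to the decoder's cross-attention. The class and variable names here are illustrative assumptions, not the library's own.

```python
# Minimal sketch (illustrative, not the exact transformers implementation) of
# bridging mismatched encoder/decoder hidden sizes before cross-attention.
import torch
import torch.nn as nn


class EncoderToDecoderProjection(nn.Module):
    """Projects encoder hidden states to the decoder's hidden size so the
    decoder's cross-attention key/value layers receive the shape they expect."""

    def __init__(self, encoder_hidden_size: int, decoder_hidden_size: int):
        super().__init__()
        # Only needed when the two sizes differ; otherwise an identity suffices.
        self.enc_to_dec_proj = nn.Linear(encoder_hidden_size, decoder_hidden_size)

    def forward(self, encoder_hidden_states: torch.Tensor) -> torch.Tensor:
        # (batch, src_len, encoder_hidden_size) -> (batch, src_len, decoder_hidden_size)
        return self.enc_to_dec_proj(encoder_hidden_states)


# Example: a BERT-base-sized encoder (hidden size 768) paired with a larger
# decoder (hidden size 1024).
proj = EncoderToDecoderProjection(768, 1024)
encoder_out = torch.randn(2, 16, 768)   # (batch, src_len, enc_hidden)
print(proj(encoder_out).shape)          # torch.Size([2, 16, 1024])
```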
