Using EncoderDecoderModel

Hi,

EncoderDecoderModel is meant to combine any bidirectional text encoder (e.g. BERT) with any autoregressive text decoder (e.g. GPT2). We’re planning to add a VisionEncoderDecoderModel (recently we’ve added SpeechEncoderDecoderModel, which allows you to combine any speech autoencoding model such as Wav2Vec2 with any autoregressive text decoder).

Feel free to contribute this if you are interested!