Support for LLaMA in EncoderDecoder framework

I'm trying to use LLaMA as a drop-in replacement for GPT2 in my ViT-GPT2 model.
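For context, a minimal sketch of the kind of ViT-GPT2 setup being replaced, built from configs only so no weights are downloaded (the default `ViTConfig`/`GPT2Config` sizes are assumptions, not necessarily my actual checkpoints):

```python
from transformers import (
    VisionEncoderDecoderConfig,
    VisionEncoderDecoderModel,
    ViTConfig,
    GPT2Config,
)

# Combine a ViT encoder config with a GPT2 decoder config. This helper
# marks the decoder as a decoder and enables cross-attention on it.
config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(
    ViTConfig(),   # encoder: ViT-base defaults (assumed)
    GPT2Config(),  # decoder: GPT2-base defaults (assumed)
)

# Randomly initialized model with the combined architecture.
model = VisionEncoderDecoderModel(config=config)
```

Swapping in LLaMA would presumably mean replacing `GPT2Config` with `LlamaConfig` here, which is where the adaptation work comes in.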

After reading "Using FNet model in Encoder Decoder Models · Issue #22308 · huggingface/transformers · GitHub", it seems that HuggingFace doesn't plan to support future models in the EncoderDecoder framework, and that I should adapt the model to suit my own needs.

I'm planning to follow the steps described in "Trying to add support for GPT2 as decoder in EncoderDecoder model · Issue #4483 · huggingface/transformers · GitHub".

Are there any gotchas I should know about?