Fusion-in-Decoder models

The original implementation is built on the Hugging Face Transformers library. It is a simple (yet very effective) wrapper around the T5 encoder.
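To make the idea of an encoder wrapper concrete, here is a minimal, dependency-free sketch of the shape handling such a wrapper performs: each retrieved passage is encoded on its own, and the per-passage outputs are concatenated into one long sequence for the decoder to attend over. This is an illustration under stated assumptions, not the original code: `toy_encoder` and `fid_encode` are hypothetical names, and the toy encoder merely stands in for the T5 encoder.

```python
from typing import Callable, List

Vector = List[float]
Passage = List[int]  # token ids of one passage


def toy_encoder(tokens: Passage) -> List[Vector]:
    """Hypothetical stand-in for the T5 encoder: one 2-d vector per token."""
    return [[float(t), float(len(tokens))] for t in tokens]


def fid_encode(
    batch: List[List[Passage]],
    encoder: Callable[[Passage], List[Vector]] = toy_encoder,
) -> List[List[Vector]]:
    """Encode each passage independently, then concatenate per example.

    Input shape:  (batch_size, n_passages, passage_length)
    Output shape: (batch_size, n_passages * passage_length, hidden)
    """
    fused: List[List[Vector]] = []
    for passages in batch:
        states: List[Vector] = []
        for passage in passages:
            # Each passage is encoded on its own, so the encoder never
            # attends across passages.
            states.extend(encoder(passage))
        # A decoder would cross-attend over this single long sequence,
        # which is where the passages are fused.
        fused.append(states)
    return fused


batch = [[[1, 2, 3], [4, 5, 6]]]  # 1 example, 2 passages, 3 tokens each
fused = fid_encode(batch)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

In a tensor library the two loops would typically collapse into a reshape before the encoder call and another one after it; the nested lists here only keep the sketch runnable without extra packages.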