Fine-tuning T5 Encoder and T5 Decoder separately

Is it possible to create instances of the T5 encoder and T5 decoder separately? I would like to fine-tune the T5 encoder with a masked language modelling objective on a company-specific dataset and then use it with the T5 decoder for text generation.
I have searched extensively on Hugging Face for these classes and/or their documentation but could not find anything.

Hey @ND1, did you find anything on this? I looked into the same thing but couldn’t find anything either. We can extract the T5 encoder and decoder separately (see the sketch below). However, tuning them with AutoModelForMaskedLM does not seem possible, I guess.
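
For completeness, this is what I mean by extracting them: both stacks are plain submodules of the full seq2seq model, so something like this should work (just a sketch; the checkpoint name is only an example):

```python
from transformers import T5Model

model = T5Model.from_pretrained("t5-small")  # any T5 checkpoint would do

# Both stacks are exposed as submodules of the full seq2seq model.
encoder = model.get_encoder()  # T5Stack with self-attention only
decoder = model.get_decoder()  # T5Stack with cross-attention over encoder states

print(type(encoder).__name__, type(decoder).__name__)  # -> T5Stack T5Stack
```

So getting hold of the modules is easy; the problem is that there is no masked-LM head or training class for them.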

EDIT…

I think I was right. T5 is a sequence-to-sequence model, which means that, at least for language modelling, it needs to be trained in that fashion.

Another reason you can’t train the encoder/decoder separately for LM is that, for the encoder, you’d likely want to use AutoModelForMaskedLM, but T5 is not supported by that class. This makes sense if you think about it:

MLM models like BERT predict individual [MASK] tokens, whereas T5 is pre-trained with span corruption: contiguous spans are replaced by sentinel tokens and the model generates the dropped-out spans, which is a different objective from [MASK] prediction. For example:
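
You can see the difference straight from the tokenizer; a quick sketch (exact token ids depend on the checkpoint):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("t5-small")

# T5 has no BERT-style [MASK] token; instead it ships sentinel tokens
# (<extra_id_0> ... <extra_id_99>) that mark the corrupted spans.
print(tok.mask_token)                              # None
print(tok.convert_tokens_to_ids("<extra_id_0>"))   # a real vocab id, e.g. 32099 for t5-small

# A span-corruption pair in the format the T5 pre-training objective expects:
corrupted_input = "Thank you <extra_id_0> me to your party <extra_id_1> week."
target          = "<extra_id_0> for inviting <extra_id_1> last <extra_id_2>"
```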

So, I would recommend using this script: `examples/flax/language-modeling/run_t5_mlm_flax.py` from the huggingface/transformers repo on GitHub,

which will get you up and running with training T5 in its native seq2seq (span-corruption) fashion. Then you can throw away the decoder and use just the encoder for downstream tasks, for example:
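
Once that training is done, loading only the encoder is a one-liner with T5EncoderModel (a sketch; the checkpoint path is a placeholder for wherever the script saved your model):

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

tok = AutoTokenizer.from_pretrained("path/to/your-t5-mlm-checkpoint")
encoder = T5EncoderModel.from_pretrained(
    "path/to/your-t5-mlm-checkpoint",  # placeholder for your output_dir
    from_flax=True,  # the Flax script saves Flax weights; drop this if you converted to PyTorch
)

inputs = tok("some company-specific sentence", return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # (batch, seq_len, d_model)
```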

Hope this helps :slight_smile: