T5forConditionalGeneration

ashiishkarhade · September 15, 2020, 6:14am

Hello,
I am new to transformers and have never used the library before.
I want to know: what is the difference between T5Model and T5ForConditionalGeneration, and where is each one used?

valhalla · September 15, 2020, 7:59am

Hi @ashiishkarhade
T5Model contains the encoder (a stack of encoder layers) and the decoder (a stack of decoder layers) without any task-specific head. It returns the raw hidden states of the decoder as output.

T5ForConditionalGeneration also contains the encoder and decoder, and adds a linear layer (lm_head) on top. This layer takes the final hidden states of the decoder and maps them to logits over the vocabulary, from which the next token is predicted.
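To make the difference in outputs concrete, here is a minimal sketch. It assumes a recent version of transformers with PyTorch installed, and it uses a tiny, randomly initialised T5Config (the sizes and token ids are made up for illustration) so nothing has to be downloaded. With real checkpoints you would load both classes via from_pretrained instead.

```python
import torch
from transformers import T5Config, T5Model, T5ForConditionalGeneration

# Tiny random config (illustrative sizes, not a real checkpoint).
config = T5Config(
    vocab_size=100, d_model=32, d_kv=8, d_ff=64,
    num_layers=2, num_heads=4, decoder_start_token_id=0,
)

base = T5Model(config).eval()
seq2seq = T5ForConditionalGeneration(config).eval()

# Arbitrary token ids standing in for a tokenized source and target prefix.
input_ids = torch.tensor([[5, 6, 7, 1]])
decoder_input_ids = torch.tensor([[0, 8, 9]])

with torch.no_grad():
    base_out = base(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
    gen_out = seq2seq(input_ids=input_ids, decoder_input_ids=decoder_input_ids)

# Base model: raw decoder hidden states, shape (batch, target_len, d_model).
hidden_shape = tuple(base_out.last_hidden_state.shape)
# With lm_head: vocabulary logits, shape (batch, target_len, vocab_size).
logits_shape = tuple(gen_out.logits.shape)
print(hidden_shape, logits_shape)
```

The only change between the two outputs is the last dimension: d_model for the base model, vocab_size once lm_head is applied.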

For fine-tuning the model for seq2seq generation you should use T5ForConditionalGeneration. If you want to add a different task-specific head, you can use T5Model.
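Both options can be sketched roughly as follows. This again assumes a tiny random T5Config rather than a pretrained checkpoint. The class T5WithCustomHead and its choice to classify from the last decoder position are hypothetical, just one way a custom head could be attached; they are not something the library provides.

```python
import torch
from torch import nn
from transformers import T5Config, T5Model, T5ForConditionalGeneration

# Tiny random config (illustrative sizes, not a real checkpoint).
config = T5Config(
    vocab_size=100, d_model=32, d_kv=8, d_ff=64,
    num_layers=2, num_heads=4, decoder_start_token_id=0,
)

input_ids = torch.tensor([[5, 6, 7, 1]])  # made-up source token ids
labels = torch.tensor([[8, 9, 1]])        # made-up target token ids

# Option 1: seq2seq fine-tuning. Passing labels makes the model build
# decoder_input_ids internally and return a cross-entropy loss.
seq2seq = T5ForConditionalGeneration(config)
loss = seq2seq(input_ids=input_ids, labels=labels).loss
loss.backward()  # gradients are now available for an optimizer step


# Option 2 (hypothetical): your own head on top of the base model.
class T5WithCustomHead(nn.Module):
    def __init__(self, config, num_classes):
        super().__init__()
        self.t5 = T5Model(config)
        self.head = nn.Linear(config.d_model, num_classes)

    def forward(self, input_ids, decoder_input_ids):
        hidden = self.t5(
            input_ids=input_ids, decoder_input_ids=decoder_input_ids
        ).last_hidden_state
        # Classify from the hidden state at the last decoder position.
        return self.head(hidden[:, -1])


clf = T5WithCustomHead(config, num_classes=3)
# Feed a single start token (id 0 here) to the decoder.
class_logits = clf(input_ids, torch.tensor([[0]]))
print(loss.item(), tuple(class_logits.shape))
```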

Almost all models in the library follow this structure: a base model that returns raw hidden states, and additional models with task-specific heads (ForSequenceClassification, ForQuestionAnswering, etc.) on top of the base model.
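The Auto classes reflect the same base/head split. As a small sketch (assuming a recent transformers version and, once more, a tiny random config instead of a downloaded checkpoint), the generic AutoModel resolves to the base class for a T5 config, while AutoModelForSeq2SeqLM resolves to the class with the language-modeling head:

```python
from transformers import (
    AutoModel,
    AutoModelForSeq2SeqLM,
    T5Config,
    T5ForConditionalGeneration,
    T5Model,
)

# Tiny random config (illustrative sizes, not a real checkpoint).
config = T5Config(
    vocab_size=100, d_model=32, d_kv=8, d_ff=64,
    num_layers=2, num_heads=4, decoder_start_token_id=0,
)

base = AutoModel.from_config(config)                  # base model, no head
with_head = AutoModelForSeq2SeqLM.from_config(config) # base model + lm_head

print(type(base).__name__)       # T5Model
print(type(with_head).__name__)  # T5ForConditionalGeneration
```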

ashiishkarhade · September 15, 2020, 8:31am

Thank you for the brief explanation.
