Fine-tuning T5 on Tensorflow

Rocketknight1 · November 29, 2021, 1:39pm

Hi @mazerte, sorry for the delay in replying! This is one of those cases where I’d actually recommend trying our new “internal loss” method. For more complex models like Seq2Seq, getting the right Keras losses can be hard: it’s possible, but it can require a lot of knowledge and some hacky code. Instead, just let our model compute the loss for you! To do that, you should do two things:

First, move the labels into the input dictionary so that they’re visible to the model on the forward pass, like so:

tf_train = inputs.to_tf_dataset(
  columns=["attention_mask", "input_ids", "decoder_input_ids", "labels"],
  shuffle=True,
  collate_fn=data_collator,
  batch_size=batch_size,
)

Second, remove the loss argument from your compile() call. Note that right now we don’t support Keras metrics when using the internal loss, but this is an area of very active development, so that will hopefully change soon!

# No loss argument here: the model falls back to its internal loss
model.compile(
  optimizer=optimizer,
)

If you make these two changes, your model should train successfully. We recommend this method whenever you’re not sure which loss to use.
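If it helps to see why the labels have to travel inside the input dictionary, here's a toy sketch in plain Python. To be clear, `ToyModel` and `cross_entropy` are made-up names for illustration only, and this is not how Transformers implements it internally. It just shows the general idea: a model can only compute its own loss if the labels arrive together with the inputs on the forward pass.

```python
import math


def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


class ToyModel:
    """Hypothetical stand-in for a model with an internal loss."""

    def __call__(self, batch):
        # Fake, fixed logits over a 2-token vocabulary (one row per input
        # token), so the example stays deterministic.
        logits = [[0.0, 0.0] for _ in batch["input_ids"]]
        outputs = {"logits": logits}
        # The loss can only be computed when labels come in with the inputs.
        if "labels" in batch:
            losses = [
                cross_entropy(row, label)
                for row, label in zip(logits, batch["labels"])
            ]
            outputs["loss"] = sum(losses) / len(losses)
        return outputs


model = ToyModel()
without_labels = model({"input_ids": [5, 7]})
with_labels = model({"input_ids": [5, 7], "labels": [0, 1]})

print("loss" in without_labels)  # no labels in the batch, so no loss
print(with_labels["loss"])       # mean cross-entropy over the two tokens
```

This is the same reason "labels" needs to be in the columns list above: if the labels are kept out of the input dictionary, the model never sees them and has nothing to compute a loss from.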
