How to train T5 with TensorFlow

There are many examples showing how to train T5 in PyTorch, but none so far for TensorFlow. Many people have been asking about this on :hugs:transformers.

Does anybody have a good notebook showing how to train T5 in TensorFlow?

Otherwise, I will try to translate @valhalla's great notebook to TensorFlow.


Thanks very much Patrick @patrickvonplaten!

In Suraj Patil’s notebook, he used the PyTorch Trainer to train T5.
At first, I didn’t know that Trainer could be used for Seq2Seq problems, since “The Big Table of Tasks” states that Trainer does not yet support Translation / Summarization.

I will try to use TFTrainer for TF2 on Seq2Seq problems. If that doesn’t work, I think I will try to write a custom training loop in TF2.

You can use Trainer for seq2seq as well; you’ll just need to write a different data collator that returns the arguments the model expects.

A few things to note about that notebook:
I wrote it before v3.0.0, and a few things have changed since then.

  1. DataCollator is not a class anymore, so you won’t need to inherit from DataCollator when creating T2TDataCollator. Also, collate_batch should be renamed to __call__.
  2. lm_labels is now deprecated; use labels instead.
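
To make the two changes above concrete, here is a minimal sketch of what an updated collator could look like. It is not the notebook's actual code: the feature names (`input_ids`, `attention_mask`, `target_ids`), the pad id of 0, the ignore value of -100 and the `to_tensor` hook are all assumptions for illustration. It uses plain Python lists so it runs anywhere; in practice you would pass something like `torch.tensor` or `tf.constant` as `to_tensor`.

```python
from typing import Callable, Dict, List, Sequence


class T2TDataCollator:
    """Plain callable: no DataCollator base class, and __call__ instead of collate_batch."""

    def __init__(
        self,
        pad_token_id: int = 0,        # assumption: padding id used by the tokenizer
        ignore_index: int = -100,     # assumption: label value the loss is set up to skip
        to_tensor: Callable = lambda rows: rows,  # hypothetical hook, e.g. torch.tensor
    ):
        self.pad_token_id = pad_token_id
        self.ignore_index = ignore_index
        self.to_tensor = to_tensor

    def __call__(self, batch: List[Dict[str, Sequence[int]]]) -> Dict[str, object]:
        # Assumes every example is already padded to the same length.
        input_ids = [list(ex["input_ids"]) for ex in batch]
        attention_mask = [list(ex["attention_mask"]) for ex in batch]
        # Replace padding in the targets so those positions do not count in the loss.
        labels = [
            [self.ignore_index if tok == self.pad_token_id else tok
             for tok in ex["target_ids"]]
            for ex in batch
        ]
        return {
            "input_ids": self.to_tensor(input_ids),
            "attention_mask": self.to_tensor(attention_mask),
            "labels": self.to_tensor(labels),  # "labels", not the deprecated "lm_labels"
        }


# Toy batch with made-up token ids.
batch = [
    {"input_ids": [37, 1712, 1, 0], "attention_mask": [1, 1, 1, 0], "target_ids": [71, 1, 0, 0]},
    {"input_ids": [8, 1, 0, 0], "attention_mask": [1, 1, 0, 0], "target_ids": [5, 6, 7, 1]},
]
collated = T2TDataCollator()(batch)
```

The collator only has to be callable on a list of examples and return a dict whose keys match the model's forward arguments, which is why no base class is needed.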

Let me know if you run into problems using the notebook.


Any luck with the TensorFlow T5 notebook?

This might be related: How to train TFT5ForConditionalGeneration model?

Okay, this week I will start working on a T5 TF notebook showing how T5 can be fine-tuned on CNN / Daily Mail using the TF Trainer.


Hey @patrickvonplaten, I want to contribute a fully working TF T5 training/fine-tuning notebook. How do I do that?

Hey @HarrisDePerceptron,

That sounds awesome! Usually people create a Google Colab and add it under community notebooks here:

Looking forward to your notebook :slight_smile:

Hey everyone. We recently contributed a community notebook that lets you train T5 using pure TensorFlow 2. Do check it out! If you run into any issues, you can log them in our official repo. :hugs: :hugs:


Hello everyone.

I am struggling to find a working notebook that illustrates training T5 using Keras.

@HarrisDePerceptron your notebook doesn’t seem to run at the moment.

@patrickvonplaten Can you point me to something useful?

This thread seemed to be leading in that direction, but the trail has gone cold.