Fine-tuning T5 with Trainer for a novel task

Hi guys,

I am trying to fine-tune T5 with Hugging Face's Trainer class, reusing as much of the existing training code as possible.

However, I am unsure what the Trainer.train() method actually does under the hood. In the T5 paper, the authors describe three fine-tuning methods they experimented with (§3.5.1):

  • training only additional adapter layers
  • gradually unfreezing the model and training more and more layers
  • training the whole model right away (I sketch my current setup for this below)
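Here is a minimal, self-contained sketch of my current setup, with toy data and `t5-small` as a placeholder checkpoint. My assumption is that this corresponds to the third strategy, since I never freeze anything:

```python
import torch
from torch.utils.data import Dataset
from transformers import (
    T5ForConditionalGeneration,
    T5Tokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

class ToyDataset(Dataset):
    """Stand-in for my real dataset: a few input/target pairs."""
    def __init__(self):
        self.pairs = [("translate English to German: Hello", "Hallo")] * 8

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        src, tgt = self.pairs[idx]
        enc = tokenizer(src, padding="max_length", max_length=32,
                        truncation=True, return_tensors="pt")
        labels = tokenizer(tgt, padding="max_length", max_length=8,
                           truncation=True, return_tensors="pt").input_ids
        labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
        return {
            "input_ids": enc.input_ids.squeeze(0),
            "attention_mask": enc.attention_mask.squeeze(0),
            "labels": labels.squeeze(0),
        }

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="t5-out", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=ToyDataset(),
)
trainer.train()  # my assumption: this updates *all* parameters
```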

What does Hugging Face's Trainer.train() do by default? And is there a simple way of switching between the three strategies?
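For the first two strategies, the only approach I can think of is toggling requires_grad myself before and during training. Below is a sketch of what I mean; note that stock T5 has no adapter layers, so freezing everything except lm_head is just a stand-in, and the one-block-per-epoch unfreezing schedule is made up:

```python
from transformers import TrainerCallback

def freeze_all_but_lm_head(model):
    """Stand-in for 'training only additional layers': freeze everything
    except the LM head (stock T5 has no adapter layers)."""
    for param in model.parameters():
        param.requires_grad = False
    for param in model.lm_head.parameters():
        param.requires_grad = True

class GradualUnfreezeCallback(TrainerCallback):
    """Gradual unfreezing: at the start of each epoch, unfreeze one more
    decoder block, working from the top of the stack downwards.
    Assumes the model starts out mostly frozen."""
    def on_epoch_begin(self, args, state, control, model=None, **kwargs):
        blocks = model.decoder.block
        n_unfrozen = min(int(state.epoch) + 1, len(blocks))
        for block in blocks[-n_unfrozen:]:
            for param in block.parameters():
                param.requires_grad = True
```

I would then call freeze_all_but_lm_head(model) before building the Trainer, or pass callbacks=[GradualUnfreezeCallback()] to it. One thing I'm not sure about: as far as I can tell, Trainer builds its optimizer from the parameters that require gradients when training starts, so would parameters unfrozen later by the callback actually be updated?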