Hi guys,
I am trying to fine-tune T5 with Hugging Face's Trainer class, recycling as much existing training code as possible. However, I am wondering what the Trainer.train() method actually does. In the T5 paper the authors mention three fine-tuning methods that they compared (§3.5.1):
- training only additional adapter layers
- gradually unfreezing the model and training more and more layers
- training the whole model right away
Which of these strategies does Hugging Face's Trainer.train() use? And is there a simple way of switching between them?
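For reference, here is roughly how I imagine selecting the trainable layers myself before handing the model to the Trainer. This is just a minimal sketch with a toy torch module standing in for T5; I assume the same requires_grad pattern would apply to a real T5ForConditionalGeneration, since the Trainer's optimizer only updates parameters with requires_grad set:

```python
import torch.nn as nn

# Toy stand-in for a pretrained model (assumption: freezing works the
# same way on an actual T5ForConditionalGeneration instance).
model = nn.Sequential(
    nn.Linear(8, 8),   # "lower" layer, stays frozen
    nn.Linear(8, 8),   # "upper" layer, gets trained
)

# One possible strategy: freeze everything, then unfreeze only the top layer.
for p in model.parameters():
    p.requires_grad = False
for p in model[1].parameters():
    p.requires_grad = True

# Collect the names of the parameters that would actually be updated.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Gradual unfreezing would then just mean flipping requires_grad back to True for more layers as training progresses, if I understand correctly.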