I am new to huggingface and the task of text generation.
I’d like to create my own train/eval loop to fine-tune a text generation model based on the following checkpoint: dbmdz/german-gpt2 · Hugging Face
I found a very good tutorial on how to do that using the Trainer class (https://www.philschmid.de/fine-tune-a-non-english-gpt-2-model-with-huggingface).
The blog post led me to some questions:
How exactly are text generation models trained under the hood, if we want to write our own loop for full control instead of using the Hugging Face Trainer?
How does the Trainer class decide how to train the model?
Where can I find out which training strategy the Trainer uses?
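To make the first question concrete, here is a minimal sketch of the kind of loop I have in mind, based on my current understanding (assuming the standard causal-LM setup where passing labels=input_ids makes the model compute the shifted cross-entropy loss; I use a tiny randomly initialized GPT-2 config and dummy token ids so the snippet runs standalone, but in practice one would load the dbmdz/german-gpt2 checkpoint and real tokenized batches):

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny randomly initialized GPT-2 so the sketch runs offline;
# in practice: AutoModelForCausalLM.from_pretrained("dbmdz/german-gpt2")
config = GPT2Config(vocab_size=100, n_positions=32, n_embd=32, n_layer=2, n_head=2)
model = GPT2LMHeadModel(config)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

# Dummy batch of token ids standing in for tokenized text.
batch = torch.randint(0, 100, (4, 16))

model.train()
for step in range(3):
    # Passing labels=input_ids makes the model shift the labels
    # internally and return the causal-LM cross-entropy loss.
    outputs = model(input_ids=batch, labels=batch)
    loss = outputs.loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Is this roughly what the Trainer does internally for a causal LM, modulo gradient accumulation, scheduling, and mixed precision?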
Thank you for your help! =)