Fine-tuning GPT-2 with a user-defined loss

One caveat of using your own nn.Module with Trainer is that the save function checks which kind of network is being passed via `isinstance(self.model, PreTrainedModel)`, and if it is not a PreTrainedModel (like nn.Module in this case, or in many cases where users define their own), the training stops. One thing I'd like to propose is to support both and give the user a warning that some functionality provided by modules inheriting from PreTrainedModel won't work.
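A minimal sketch of what such a fallback could look like: instead of stopping when the model is not a PreTrainedModel, warn and save the raw state dict. The `save_model` helper and `CustomGPT2` class are hypothetical names for illustration, not part of the transformers API.

```python
import os
import tempfile
import warnings

import torch
import torch.nn as nn

try:  # PreTrainedModel is only needed for the type check
    from transformers import PreTrainedModel
except ImportError:  # let the sketch run without transformers installed
    PreTrainedModel = ()  # isinstance(x, ()) is always False

def save_model(model: nn.Module, output_dir: str) -> None:
    """Save a model, warning (instead of stopping) when it is a plain
    nn.Module rather than a PreTrainedModel."""
    os.makedirs(output_dir, exist_ok=True)
    if isinstance(model, PreTrainedModel):
        model.save_pretrained(output_dir)  # saves config + weights
    else:
        warnings.warn(
            "Model is not a PreTrainedModel; saving only the state_dict. "
            "PreTrainedModel features such as from_pretrained will not work."
        )
        torch.save(model.state_dict(),
                   os.path.join(output_dir, "pytorch_model.bin"))

# Hypothetical custom model with its own loss, standing in for the
# user-defined nn.Module from the discussion above.
class CustomGPT2(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(4, 4)

    def forward(self, x):
        return self.head(x)

with tempfile.TemporaryDirectory() as d:
    save_model(CustomGPT2(), d)
    saved_ok = os.path.exists(os.path.join(d, "pytorch_model.bin"))
print(saved_ok)  # True
```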

So you’ll have to redefine save as well. On top of that, AutoModel.from_pretrained won’t work directly if you pass it the saved path, since it expects the checkpoint to be an instance of PreTrainedModel, so you’ll have to load the weights manually with torch.load.
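Loading manually then amounts to rebuilding the module and calling load_state_dict on the result of torch.load. Again a sketch with a hypothetical `CustomGPT2` class; the checkpoint path and class would be whatever your training run produced.

```python
import os
import tempfile

import torch
import torch.nn as nn

class CustomGPT2(nn.Module):  # hypothetical user-defined model
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(4, 4)

    def forward(self, x):
        return self.head(x)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "pytorch_model.bin")
    model = CustomGPT2()
    torch.save(model.state_dict(), path)

    # AutoModel.from_pretrained(d) would fail here: there is no config.json
    # and the class is not a PreTrainedModel. Rebuild the module instead
    # and load the weights manually:
    reloaded = CustomGPT2()
    reloaded.load_state_dict(torch.load(path))
    weights_match = torch.equal(model.head.weight, reloaded.head.weight)
print(weights_match)  # True
```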