How to update GPT2 with a loss provided by a separate module?

yananchen · June 8, 2021, 1:47am

Suppose I have N prompts (sentences) for generation. They are fed into GPT2, which produces the corresponding synthetic sentences. I also have a separate black box, just another component, that returns a loss for these synthetic samples. The natural idea is that, for every batch, GPT2 generates samples and the black box returns the loss for the current GPT2 parameters, and this repeats. The goal is for GPT2 to reduce this loss with each update iteration.

What I want to do is use the loss from the black box to update the parameters of GPT2 on every batch.

Generating with GPT2 is simple, but how can I implement the update with this loss? Is there an example of doing this? In particular, how should the parameters be updated properly? That is, should all of them be updated in the same way, with no distinction between them?
Please share any thoughts, thanks.
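
To make the loop described above concrete, here is a minimal sketch of one possible approach, not a definitive recipe. It assumes the black box returns only a number per sample, so its loss cannot be back-propagated through the sampled text; in that case a score-function (REINFORCE) estimator is one common option. The toy "generator" is a softmax over four tokens standing in for GPT2, and `black_box_loss`, `reinforce_step` and `train` are hypothetical names invented for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def black_box_loss(token):
    """Hypothetical stand-in for the separate module: only token 2 is 'good'."""
    return 0.0 if token == 2 else 1.0


def reinforce_step(logits, batch_size, lr, rng):
    """Sample a batch, score it with the black box, take one gradient step."""
    probs = softmax(logits)
    tokens = rng.choices(range(len(logits)), weights=probs, k=batch_size)
    losses = [black_box_loss(t) for t in tokens]
    baseline = sum(losses) / batch_size  # batch mean, used to reduce variance
    grad = [0.0] * len(logits)
    for t, loss in zip(tokens, losses):
        advantage = loss - baseline
        for k in range(len(logits)):
            # d log p(t) / d logit_k = 1[k == t] - p_k for a softmax policy
            indicator = 1.0 if k == t else 0.0
            grad[k] += advantage * (indicator - probs[k]) / batch_size
    # Gradient descent on the estimated expected loss
    new_logits = [w - lr * g for w, g in zip(logits, grad)]
    return new_logits, baseline


def train(steps=300, batch_size=32, lr=0.5, seed=0):
    """Repeat: generate, query the black box, update the generator."""
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0, 0.0]
    history = []
    for _ in range(steps):
        logits, mean_loss = reinforce_step(logits, batch_size, lr, rng)
        history.append(mean_loss)
    return logits, history
```

With GPT2 the same surrogate, `(loss - baseline) * log_prob`, would be built from the log-probabilities the model assigns to its own sampled tokens, and an ordinary optimizer step would then update all parameters through backpropagation; no per-parameter special handling is implied by this sketch.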
