I am using a GPT-2 model to generate text. During training I feed the model a concatenation of context+target, and during inference I pass in the context and the model predicts context+target. In training, I would like to modify the loss function so it is computed between the logits and the target only; I don't want the loss to include the context predictions. Any tips on how I could do this? It would be great if you could point me to some resources that might help.
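One common way to do this (not necessarily the only one) is to build a `labels` tensor that copies `input_ids` but sets the context positions to `-100`, since both Hugging Face's GPT-2 loss and PyTorch's `cross_entropy` ignore that index by default. The sketch below shows the idea on toy tensors, without loading a real model; `mask_context_labels` is a hypothetical helper name, and the shift mimics what GPT-2 does internally.

```python
import torch
import torch.nn.functional as F

def mask_context_labels(input_ids, context_len, ignore_index=-100):
    # Hypothetical helper: copy input_ids into labels and mask the
    # context positions so they contribute nothing to the loss.
    labels = input_ids.clone()
    labels[:, :context_len] = ignore_index
    return labels

torch.manual_seed(0)
vocab_size = 10

# Toy batch: first 3 tokens are "context", last 2 are "target".
input_ids = torch.tensor([[1, 2, 3, 4, 5]])
labels = mask_context_labels(input_ids, context_len=3)
# labels is now [[-100, -100, -100, 4, 5]]

# Stand-in for model(input_ids).logits, shape (batch, seq, vocab).
logits = torch.randn(1, 5, vocab_size)

# Shift so position i predicts token i+1, as GPT-2's loss does.
shift_logits = logits[:, :-1, :]
shift_labels = labels[:, 1:]

# -100 positions (the context) are ignored by cross_entropy.
loss = F.cross_entropy(
    shift_logits.reshape(-1, vocab_size),
    shift_labels.reshape(-1),
    ignore_index=-100,
)
```

With the `transformers` `Trainer`, you would pass this masked `labels` tensor instead of a plain copy of `input_ids`; the model's built-in loss then skips the context tokens automatically.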