I am using a GPT-2 model to generate text. During training I feed the model a concatenation of context+target, and during inference I pass in the context and the model predicts context+target. In training, I would like to modify the loss function so that it is computed between the logits and the target only; I don't want to include the loss for the context predictions. Any tips on how I could do this? It would be great if you could point me to some resources that might help.
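For reference, here is a minimal sketch of the approach I had in mind, assuming Hugging Face-style label conventions where positions set to `-100` are ignored by the cross-entropy loss (the token ids, context length, and vocab size below are made up for illustration, and random logits stand in for the model output):

```python
import torch
import torch.nn.functional as F

context_len = 4
input_ids = torch.tensor([[50, 51, 52, 53, 10, 11, 12]])  # context + target ids

# Copy input_ids into labels, then mask the context positions with -100
# so they contribute nothing to the loss.
labels = input_ids.clone()
labels[:, :context_len] = -100

# Stand-in for model(input_ids).logits: shape (batch, seq_len, vocab_size).
vocab_size = 100
logits = torch.randn(1, input_ids.size(1), vocab_size)

# Standard causal-LM shift: position t predicts token t+1.
shift_logits = logits[:, :-1, :]
shift_labels = labels[:, 1:]

# ignore_index=-100 drops the masked (context) positions, so the loss
# is averaged over target tokens only.
loss = F.cross_entropy(
    shift_logits.reshape(-1, vocab_size),
    shift_labels.reshape(-1),
    ignore_index=-100,
)
```

With a real model you would pass these masked `labels` directly (e.g. `model(input_ids, labels=labels)`) instead of computing the loss by hand.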