I am using a GPT-2 model to generate text. During training I feed the model a concatenation of context+target, and during inference I pass in the context and the model predicts context+target. In training, I would like to modify the loss function so that the loss is calculated between the logits and the target only; I don't want to include the loss for the context predictions. Any tips on how I could do this? It would be great if you could point me to some resources that might help.
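A minimal sketch, assuming you are using Hugging Face's `GPT2LMHeadModel`: the model's built-in loss uses `CrossEntropyLoss` with `ignore_index=-100`, so you can copy `input_ids` into `labels` and overwrite the context positions with `-100`. The example strings and variable names below are hypothetical, just to illustrate the masking.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical context and target strings
context = "Question: What is the capital of France? Answer:"
target = " Paris."

context_ids = tokenizer(context, return_tensors="pt").input_ids
target_ids = tokenizer(target, return_tensors="pt").input_ids

# Training input is the concatenation of context and target
input_ids = torch.cat([context_ids, target_ids], dim=1)

# Labels start as a copy of input_ids; positions set to -100 are
# ignored by the cross-entropy loss inside GPT2LMHeadModel
labels = input_ids.clone()
labels[:, : context_ids.shape[1]] = -100

outputs = model(input_ids=input_ids, labels=labels)
loss = outputs.loss  # loss over the target tokens only
loss.backward()
```

If you batch multiple examples, you would build the `-100` mask per example (context lengths differ), and also mask any padding positions the same way.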