I am using a GPT-2 model to generate text. During training I feed the model a concatenation of context + target, and during inference I pass in the context and the model predicts context + target. In training, I would like to modify the loss function so that it is computed between the logits and the target only; I don't want to include the loss on the context predictions. Any tips on how I could do this? It would be great if you could point me to some resources that might help.
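For anyone landing here: one common approach (not from the original post) relies on the fact that PyTorch's cross-entropy loss skips positions whose label is `-100` (its default `ignore_index`), which is also the convention Hugging Face causal-LM models follow when you pass `labels`. So you can copy `input_ids` into `labels` and overwrite the context positions with `-100`. A minimal sketch with dummy logits, where `masked_lm_loss` and `context_len` are illustrative names:

```python
import torch
import torch.nn.functional as F

def masked_lm_loss(logits, input_ids, context_len):
    # Use the inputs themselves as labels, but mask the context
    # positions with -100 so cross_entropy ignores them.
    labels = input_ids.clone()
    labels[:, :context_len] = -100
    # Standard causal-LM shift: position t predicts token t+1.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,  # default, shown here for clarity
    )

# Toy example: batch of 1, sequence of 6 tokens (4 context + 2 target),
# vocabulary of 10. In practice the logits come from the GPT-2 forward pass.
torch.manual_seed(0)
logits = torch.randn(1, 6, 10)
input_ids = torch.randint(0, 10, (1, 6))
loss = masked_lm_loss(logits, input_ids, context_len=4)
```

With a Hugging Face `GPT2LMHeadModel` you would not need the manual shift at all: build `labels` the same way (clone `input_ids`, set the context span to `-100`) and pass them to the model's forward call; the model applies the shift and the `-100` masking internally.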