GPT-2 custom loss

I am using a GPT-2 model to generate text. During training I feed the model a concatenation of context+target; during inference I pass in only the context, and the model outputs context+target. In training, I would like to modify the loss function so that the loss is computed only between the logits and the target tokens. I don't want the context predictions to contribute to the loss. Any tips on how I could do this? It would be great if you could point me to some resources that might help.
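
To make the question concrete, here is a minimal sketch of the kind of thing I have in mind, assuming PyTorch. It copies the input ids into a labels tensor, overwrites the context positions with -100 (the default `ignore_index` of `F.cross_entropy`), and computes the shifted next-token loss over what remains. The helper names `build_labels` and `target_only_loss`, the toy sizes, and the random logits standing in for the model output are all my own placeholders, not part of any library.

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # default ignore_index of F.cross_entropy


def build_labels(input_ids, context_lengths):
    """Copy input_ids and mask the context positions so they add no loss."""
    labels = input_ids.clone()
    for row, n_ctx in enumerate(context_lengths):
        labels[row, :n_ctx] = IGNORE_INDEX
    return labels


def target_only_loss(logits, labels):
    """Next-token cross-entropy averaged over unmasked (target) positions."""
    # Position t predicts token t+1, so drop the last logit and the first label.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=IGNORE_INDEX,
    )


torch.manual_seed(0)
vocab_size, seq_len = 50, 8   # toy sizes, not GPT-2's real ones
input_ids = torch.randint(0, vocab_size, (2, seq_len))
context_lengths = [3, 5]      # hypothetical context length of each example
# Random stand-in for the logits the model would return for input_ids.
logits = torch.randn(2, seq_len, vocab_size)

labels = build_labels(input_ids, context_lengths)
loss = target_only_loss(logits, labels)
```

If I read the Hugging Face documentation correctly, `GPT2LMHeadModel` does the shift internally and ignores labels set to -100, so passing `labels=labels` to the model might already be enough without a custom loss function. Padding positions would presumably need the same -100 treatment.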