Loss in a Seq2Seq task

Task: Generate a response given a text input (by fine-tuning the model)
Model: gpt2-xl
Question: How do I calculate the loss at the sentence level?

My approach: I generate the next token_id one step at a time with model(input_ids), compute the cross-entropy loss between output.logits and the expected token_id, then call loss.backward() and optimizer.step(), and repeat for every token in the target. This works, but the overall time consumption is far too high. Are there other approaches that would give me the overall loss over the generated sequence, e.g. using model.generate(input_ids)?
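
A minimal sketch of my current loop (the prompt/response strings, learning rate, and optimizer are placeholders; note that F.cross_entropy applies log-softmax internally, so I pass the raw logits):

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-xl")
model = GPT2LMHeadModel.from_pretrained("gpt2-xl")
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

prompt = "some input text"        # placeholder
response = " the expected reply"  # placeholder

input_ids = tokenizer(prompt, return_tensors="pt").input_ids
target_ids = tokenizer(response, return_tensors="pt").input_ids[0]

for expected_id in target_ids:
    output = model(input_ids)
    # logits for the next position; raw logits go straight into
    # F.cross_entropy (no explicit softmax)
    next_token_logits = output.logits[:, -1, :]
    loss = F.cross_entropy(next_token_logits, expected_id.unsqueeze(0))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    # append the expected token and repeat -> one full forward/backward
    # pass per target token, which is what makes this so slow
    input_ids = torch.cat([input_ids, expected_id.view(1, 1)], dim=-1)
```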
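
For reference, this is roughly what I was hoping for: one model.generate call over the whole sequence. return_dict_in_generate and output_scores are real generate arguments, but as far as I can tell the returned object exposes only per-step scores, not a loss I could backpropagate:

```python
# Hypothetical usage: generate the full response in one call and inspect
# the per-step scores; I don't see a loss field anywhere in the output
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
gen_out = model.generate(
    input_ids,
    max_new_tokens=target_ids.shape[0],
    return_dict_in_generate=True,
    output_scores=True,
)
print(gen_out.sequences)    # prompt + generated token ids
print(len(gen_out.scores))  # one logits tensor per generated step
```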

Thanks,