GPT2 Conditional Text Generation

I would like to fine-tune GPT2 on EmpatheticDialogues for a kind of conditional generation, as in this paper:
What concerns me is the format of the input_ids and labels in the forward function.
I think that concatenating the input and the target, separated by a special token, is a good solution
(e.g. "hi! how are you? I am fine!")
However, I am not sure what to do with the labels. Should I mask both the input part and the padded tokens with -100 and leave only the target part as is? Or should I mask only the padded tokens with -100?
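
To make the two options concrete, here is a minimal sketch of what I mean, in plain Python with made-up token ids. The helper `build_example`, its parameters, and the ids for the separator, end-of-sequence, and padding tokens are all hypothetical, not taken from any library. I am assuming that -100 is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, and that `GPT2LMHeadModel` shifts the labels internally, so `labels` has the same length and alignment as `input_ids`.

```python
IGNORE_INDEX = -100  # assumed: default ignore_index of torch.nn.CrossEntropyLoss


def build_example(context_ids, target_ids, sep_id, eos_id, pad_id,
                  max_len, mask_context=True):
    """Hypothetical helper: concatenate context and target, then build labels.

    mask_context=True  -> loss only on the target (context + separator masked)
    mask_context=False -> loss on context and target (only padding masked)
    """
    # context, separator token, target, end-of-sequence token
    input_ids = context_ids + [sep_id] + target_ids + [eos_id]
    if len(input_ids) > max_len:
        raise ValueError("sequence is longer than max_len")

    # Labels start as a copy of input_ids (no manual shift in this sketch).
    labels = list(input_ids)
    if mask_context:
        n_prompt = len(context_ids) + 1  # context tokens plus the separator
        labels[:n_prompt] = [IGNORE_INDEX] * n_prompt

    # Right-pad everything to max_len; padding is always masked in the labels.
    n_pad = max_len - len(input_ids)
    attention_mask = [1] * len(input_ids) + [0] * n_pad
    input_ids = input_ids + [pad_id] * n_pad
    labels = labels + [IGNORE_INDEX] * n_pad
    return input_ids, attention_mask, labels


# Made-up ids: context = [11, 12, 13], target = [21, 22]
ids, mask, target_only = build_example(
    [11, 12, 13], [21, 22], sep_id=1, eos_id=2, pad_id=0, max_len=8)
_, _, pad_only = build_example(
    [11, 12, 13], [21, 22], sep_id=1, eos_id=2, pad_id=0, max_len=8,
    mask_context=False)
```

With `mask_context=True` only the target tokens (and the end-of-sequence token) would contribute to the loss; with `mask_context=False` the model would also be trained to reproduce the context.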

Thank you in advance :slight_smile: