Shifting ids to the right when training GPT-2 on text generation?


I am slightly confused about how exactly to prepare the training data for GPT-2 text generation.

In order to train you have to provide input_ids (inputs) and labels (outputs). Both are supposed to be lists of token indices. This is the easy part.

Question: Are input_ids and labels supposed to be absolutely identical, or are the labels supposed to be input_ids shifted one element to the right?


During training, the labels are shifted inside the model (see the doc), so you should pass labels equal to input_ids.
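
To make the internal shift concrete, here is a rough, library-free sketch of the pairing it produces. As far as I understand the implementation, the model drops the logits at the last position and the label at the first position before computing the loss, so the output at position i is scored against the token at position i + 1. The function name shifted_pairs and the token ids below are made up for illustration; this is not the library's code.

```python
def shifted_pairs(input_ids):
    """Pair each position with the token it is trained to predict.

    Illustrative only: mimics the shift applied inside the model
    when labels are passed identical to input_ids.
    """
    labels = list(input_ids)             # what the user passes: labels == input_ids
    predicting_positions = input_ids[:-1]  # last position has no next token to predict
    targets = labels[1:]                 # first token is never a prediction target
    return list(zip(predicting_positions, targets))


example_ids = [10, 20, 30, 40]  # hypothetical token ids
pairs = shifted_pairs(example_ids)
print(pairs)  # [(10, 20), (20, 30), (30, 40)]
```

If you shifted the labels yourself before passing them in, the internal shift would be applied on top of yours and each position would be trained to predict the token two steps ahead.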

Thanks! Is this true for the TFGPT2LMHeadModel, too?