Hi,
You were already on the right track! The only “mistake” I see here is that GPT-2 doesn’t have a CLS token. The CLS token is only defined for encoder-only Transformers such as BERT and RoBERTa. So in this case, the decoder start token can be set to the BOS (beginning-of-sequence) token:
model.config.decoder_start_token_id = tokenizer.bos_token_id
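To put it in context, here is a minimal sketch of how that line fits into a full setup. I’m assuming an EncoderDecoderModel with a BERT encoder and a GPT-2 decoder here (swap in whatever checkpoints you’re actually using):

```python
from transformers import AutoTokenizer, EncoderDecoderModel

# Assumed setup: BERT encoder + GPT-2 decoder; adjust checkpoint names to your case.
decoder_tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "gpt2")

# GPT-2 has no CLS token, so use its BOS token as the decoder start token.
model.config.decoder_start_token_id = decoder_tokenizer.bos_token_id

# GPT-2 also has no PAD token by default; the EOS token is a common stand-in.
model.config.pad_token_id = decoder_tokenizer.eos_token_id
```

Setting pad_token_id the same way is optional but usually needed before training with batched, padded labels.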