How to fine-tune/instruction-tune a large language model on a QA corpus?

I would like to fine-tune a large language model that was pretrained with causal language modelling. The fine-tuning will be done on a question-answering (or instruction-tuning) corpus, so I don’t want to train the model to complete the question/instruction, but only to generate the answer. How can I achieve this with the Hugging Face Transformers package?
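My current guess is that the prompt tokens should be masked out in the labels with -100, which the cross-entropy loss ignores, so that only the answer tokens contribute to the loss. Is something like the following sketch the right approach? (The question/answer pair is just a placeholder; in practice it would come from the corpus.)

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Placeholder example pair; real data would come from the QA corpus.
question = "Question: What is the capital of France?\nAnswer:"
answer = " Paris"

# Tokenize prompt and answer separately so we know where the answer starts.
prompt_ids = tokenizer(question)["input_ids"]
answer_ids = tokenizer(answer + tokenizer.eos_token)["input_ids"]

input_ids = torch.tensor([prompt_ids + answer_ids])
# -100 is the ignore_index of the loss, so the prompt positions contribute
# nothing: the model is only trained to generate the answer tokens.
labels = torch.tensor([[-100] * len(prompt_ids) + answer_ids])

loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()
```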

If I load the pretrained LLM with something like GPT2ForQuestionAnswering, it warns: “Some weights of GPT2ForQuestionAnswering were not initialized from the model checkpoint at gpt2 and are newly initialized: [‘qa_outputs.weight’, ‘qa_outputs.bias’]”. But what I actually want is to keep the exact same causal-language-modelling parameters of the LLM rather than replacing the output layer.
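As far as I can tell from the docs, GPT2ForQuestionAnswering adds an extractive span-prediction head (SQuAD-style), which is why qa_outputs is newly initialized. For generating answers, I assume the checkpoint should instead be loaded with its original LM head, e.g.:

```python
from transformers import AutoModelForCausalLM

# Loads every pretrained weight, including the LM head (which is tied to
# the input embeddings in GPT-2), so nothing is newly initialized.
model = AutoModelForCausalLM.from_pretrained("gpt2")
```

Is that the correct way to keep all pretrained parameters, combined with the label masking above?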