Training CausalLM to imitate Seq2SeqModel

I want to train a causal language model to imitate a sequence-to-sequence model. To imitate the sequence-to-sequence model, I only need to take the trailing output text that comes after the input text.

Let's say, for example:

input_text = "The food is spicy. From the previous text, it is inferred that the text's sentiment is:"
output_text = "Negative"

# The food is spicy. From the previous text, it is inferred that the text's sentiment is: Negative
tokenized_inputs = tokenizer(input_text + ' ' + output_text,**tokenize_args)

This becomes a problem when I need to evaluate my model and compute metrics at each epoch, since `DataCollatorForLanguageModeling` copies the input IDs into the labels. During the evaluation phase, I want the model input to be just the `input_text` and the label to be just the `output_text`. Is there any way I can make this happen?
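To make what I'm after more concrete, here is a rough sketch of how I imagine the preprocessing could look. The function names `preprocess_train` / `preprocess_eval` are just my own placeholders, and the `gpt2` tokenizer plus the usual `-100` label-masking convention are assumptions on my part, not something I've confirmed works end to end with the `Trainer`:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model name
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

def preprocess_train(input_text, output_text, max_length=128):
    # Concatenate prompt and answer, but set the prompt positions in the labels
    # to -100 so the loss is only computed on the answer tokens.
    prompt_ids = tokenizer(input_text + " ", add_special_tokens=False)["input_ids"]
    answer_ids = tokenizer(output_text, add_special_tokens=False)["input_ids"]
    input_ids = (prompt_ids + answer_ids)[:max_length]
    labels = ([-100] * len(prompt_ids) + answer_ids)[:max_length]
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": labels,
    }

def preprocess_eval(input_text, output_text, max_length=128):
    # At evaluation time, feed only the prompt and keep the expected answer
    # as the reference, so metrics compare generated text against output_text.
    enc = tokenizer(input_text, truncation=True, max_length=max_length)
    enc["labels"] = tokenizer(output_text, add_special_tokens=False)["input_ids"]
    return enc
```

With something like this I imagine I could use a simple padding collator for training (since the labels are already built) and do generation-based metric computation at evaluation time, but I'm not sure how to wire this into the `Trainer`.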
