Training RewardTrainer - Does the number of labels matter?

Hello!

I’m trying to create a reward model (RM) from gpt2. One thing that has me a little puzzled is the number of labels that should be supplied to AutoModelForSequenceClassification. Logically, I would think it should be 1, since we need a single score for each of the chosen and rejected samples; a pairwise loss can then compare the two scores.
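
Just to be explicit about what I mean by "pairwise loss", here is a rough sketch in plain PyTorch with made-up scores (this is only my mental model, not necessarily what TRL does internally):

import torch
import torch.nn.functional as F

# Hypothetical scalar rewards for two chosen/rejected pairs
rewards_chosen = torch.tensor([1.3, 0.2])
rewards_rejected = torch.tensor([0.4, 0.9])

# Pairwise loss: push the chosen score above the rejected score
loss = -F.logsigmoid(rewards_chosen - rewards_rejected).mean()
print(loss)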

But in the example shown here, it would be 2, the default for GPT-2 sequence classification:

from peft import LoraConfig, TaskType
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments
from trl import RewardTrainer

model = AutoModelForSequenceClassification.from_pretrained("gpt2")  # num_labels defaults to 2
peft_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)

...

trainer = RewardTrainer(
    model=model,
    args=training_args,
    tokenizer=tokenizer,
    train_dataset=dataset,
    peft_config=peft_config,
)

trainer.train()

Maybe it does not matter, since the RewardTrainer does not supply any labels to the model, so no loss is computed inside AutoModelForSequenceClassification itself?
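
If the single-score view is right, I would have expected the model to be initialized like this instead (just my assumption, so that each sequence yields one scalar reward):

model = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=1)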
