Hello!
I'm trying to create a reward model (RM) from gpt2. One thing that has me a little puzzled is the number of labels that should be supplied to AutoModelForSequenceClassification. Logically, I would think it should be 1, since we need a single score for each of the chosen and rejected samples; a pairwise loss can then be computed between those two scores.
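To make that concrete, here is roughly the pairwise loss I have in mind (just a toy sketch with made-up scores, assuming one scalar score per sample; I'm not claiming this is exactly what RewardTrainer does internally):

import torch
import torch.nn.functional as F

# One scalar reward per sample, as a num_labels=1 head would produce
rewards_chosen = torch.tensor([1.7, 0.3])     # scores for the chosen responses
rewards_rejected = torch.tensor([0.9, -0.2])  # scores for the rejected responses

# Pairwise loss: push each chosen score above its rejected counterpart
loss = -F.logsigmoid(rewards_chosen - rewards_rejected).mean()
print(loss)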
But in the example shown here, it would be 2 - the default for GPT-2 sequence classification:
from peft import LoraConfig, TaskType
from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments
from trl import RewardTrainer
model = AutoModelForSequenceClassification.from_pretrained("gpt2")
peft_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
...
trainer = RewardTrainer(
    model=model,
    args=training_args,
    tokenizer=tokenizer,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
Maybe it does not matter? The RewardTrainer is not supplying any labels, so no loss is being computed inside the AutoModelForSequenceClassification itself.
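For context, this is what I would have expected to write instead (just illustrating my num_labels=1 assumption, not claiming it is the right setup):

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# A single-logit head, so each sequence gets one scalar reward score
model = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=1)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# gpt2 has no pad token by default, and sequence classification needs one
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.eos_token_id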