Scalar Reward Model

I have a general question about reward model training for LLMs. In my application scenario, (1) the input is natural language text and the reward function is defined by scalar scores (0, 1, 2, etc.). This suggests that I should use the TextClassification interface to train my reward model. However, (2) my input also has a "context-response" structure, and the scalar scores measure how well the response fits the context.

My question: Is TextClassification the best interface for this? Ideally, I would like to train the reward model to predict the score for the response given the context, so perhaps I am looking for a conditional reward model, if such a thing exists?


It looks like TextClassification with RLHF is fine.
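To make this concrete: in Hugging Face Transformers a scalar reward model is typically a sequence-classification model with `num_labels=1`, trained as a regression over the score. The context-response structure is usually handled by concatenating the two into a single input text. Below is a minimal, hedged sketch of the data formatting; the `SEP` separator and `build_example` helper are illustrative choices, not part of any library.

```python
# Hypothetical sketch: turn (context, response, score) triples into
# examples for a sequence-classification reward model
# (e.g. AutoModelForSequenceClassification with num_labels=1).
# SEP and build_example are assumed names for illustration.
SEP = "\n\n### Response:\n"

def build_example(context: str, response: str, score: float) -> dict:
    """Concatenate context and response into one input text,
    with the scalar score as a regression label."""
    return {"text": context + SEP + response, "label": float(score)}

examples = [
    build_example("What is 2 + 2?", "4", 2.0),
    build_example("What is 2 + 2?", "I'm not sure.", 0.0),
]
```

The model then conditions on the context implicitly, because the context is part of the input sequence; no separate "conditional" interface is needed.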
