How to train an LLM only on responses

Hello, how do I train the model only on responses rather than prompt and response?

Is it just a matter of attention masks?

Hi,

The easiest way is to use the SFTTrainer from trl, combined with the DataCollatorForCompletionOnlyLM. The collator lets you train only on the responses, not on the prompts.

It’s brand new, and we’re adding docs for it here: Add `DataCollatorForCompletionOnlyLM` in the docs by younesbelkada · Pull Request #565 · lvwerra/trl · GitHub
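
To the attention mask question: as far as I understand it, this is done through the labels rather than the attention mask. The model still attends to the prompt, but the prompt positions get the label `-100`, which PyTorch's cross-entropy loss ignores by default, so only the response tokens contribute to the loss. Here is a rough plain-Python sketch of that idea. It is not the trl implementation: the helper names and the token ids are made up for illustration, and returning an all-ignored example when the template is missing is just a choice made in this sketch.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def find_sublist(seq, sub):
    """Return the start index of the first occurrence of sub in seq, or -1."""
    for i in range(len(seq) - len(sub) + 1):
        if seq[i:i + len(sub)] == sub:
            return i
    return -1


def mask_prompt_labels(input_ids, response_template_ids):
    """Hypothetical helper: copy input_ids into labels and ignore every
    position up to and including the response template."""
    labels = list(input_ids)
    start = find_sublist(input_ids, response_template_ids)
    if start == -1:
        # Template not found: this sketch ignores the whole example.
        return [IGNORE_INDEX] * len(labels)
    response_start = start + len(response_template_ids)
    for i in range(response_start):
        labels[i] = IGNORE_INDEX
    return labels


# Made-up token ids: prompt = [11, 12, 13], template = [90, 91], response = [21, 22, 2]
input_ids = [11, 12, 13, 90, 91, 21, 22, 2]
labels = mask_prompt_labels(input_ids, [90, 91])
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 2]
```

The attention mask is left untouched here, so the prompt is still visible to the model as context. Only the loss targets change.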

very interesting, thanks!