Sample weighting in DPOTrainer

Is there a way to apply per-sample weights when computing the loss in DPOTrainer? For example, I know some preference pairs are more reliable than others, so I would like to give them more weight.
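
To make the question concrete, here is a minimal sketch of the kind of weighting meant, written in plain Python rather than against TRL. The function names and the `weights` argument are hypothetical and are not part of DPOTrainer's API; the sketch assumes the standard sigmoid DPO loss and that per-sequence log-probabilities under the policy and reference models are already available.

```python
import math


def dpo_pair_loss(policy_chosen_logp, policy_rejected_logp,
                  ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for one preference pair (hypothetical helper)."""
    # Scaled difference of log-ratios between chosen and rejected responses.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def weighted_dpo_loss(pairs, weights, beta=0.1):
    """Weighted mean of per-pair DPO losses.

    pairs:   list of (policy_chosen, policy_rejected, ref_chosen, ref_rejected)
             log-probabilities, one tuple per preference pair.
    weights: one non-negative reliability weight per pair (assumed input).
    """
    if len(pairs) != len(weights):
        raise ValueError("need exactly one weight per preference pair")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    losses = [dpo_pair_loss(*pair, beta=beta) for pair in pairs]
    # Normalize by the weight sum so the loss scale stays comparable
    # to the unweighted mean.
    return sum(w * l for w, l in zip(weights, losses)) / total_weight


# Two pairs; the second is considered three times as reliable as the first.
example_pairs = [(-10.0, -12.0, -11.0, -11.5), (-8.0, -9.0, -8.5, -8.5)]
print(weighted_dpo_loss(example_pairs, weights=[1.0, 3.0]))
```

With equal weights this reduces to the ordinary mean over the batch, so the question is whether DPOTrainer exposes a hook for something like the `weights` argument here.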
