DPOTrainer cannot train an encoder-decoder VLM

In DPOTrainer's compute_loss function, the batch is required to contain a "prompt_pixel_values" key, but the default data collator, DPODataCollatorWithPadding, raises ValueError("Unexpected key in batch '{k}'") if your batch contains that key. I don't know how to build my dataset so that I can train an encoder-decoder VLM with DPOTrainer.


I can’t isolate the cause because the error message is so generic…
You could try a different dataset and model first and see whether the problem persists. If it isn't model-dependent, then it's a library or program issue.

Thanks for your reply. Here are more details of my code. My model and dataset are customized: the model is a subclass of BartPretrainedModel, and the dataset is a datasets.Dataset instance. The model accepts inputs such as "pixel_values" and "decoder_input_ids" and returns logits, etc. Each sample of the dataset is a dictionary containing "chosen", "rejected", "images", and "prompt". With this layout, DPOTrainer raises KeyError: 'prompt_pixel_values'. I read the source code and found that this is because the concatenated_inputs function accesses batch["prompt_pixel_values"]. However, when I rename the "images" key of the sample dictionary returned by the dataset to "prompt_pixel_values", ValueError("Unexpected key in batch '{k}'") is raised from the __call__ function of DPODataCollatorWithPadding.
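
To make the layout concrete, here is a minimal, hypothetical version of one of my samples (the texts and the image array are just placeholders; the keys are the ones listed above):

```python
import numpy as np
from datasets import Dataset

# Dummy stand-in for a real preprocessed image (channels x height x width).
dummy_image = np.zeros((3, 224, 224), dtype=np.float32)

# One sample of my dataset; the strings are placeholders.
sample = {
    "prompt": "Describe the image.",
    "chosen": "A caption that was preferred.",
    "rejected": "A caption that was rejected.",
    "images": [dummy_image],
}
train_dataset = Dataset.from_list([sample])

# With this layout, DPOTrainer fails with KeyError: 'prompt_pixel_values'.
# Renaming "images" to "prompt_pixel_values" instead makes
# DPODataCollatorWithPadding raise ValueError("Unexpected key in batch ...").
```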


I see. That's the usual problem when running an HF library's batch processing with a custom model. I've had trouble with that too.
Whether they are batch objects or argument objects, they are instances of existing classes in the library, so it's hard to imitate them without using those classes.
One option is to give up on batching and process the samples one by one in a for loop, but that workaround can't be used in every case.
This is a case where we'd like to configure the arguments manually if possible, but that will need some research. A rough idea of what that could look like is sketched below.
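
As a starting point, here is a rough, untested sketch of that idea: subclass the default collator so the pixel tensors bypass its strict key check. The import path of DPODataCollatorWithPadding, the key name "prompt_pixel_values", and the assumption that the remaining text fields collate normally are all guesses based on the error messages above and may differ between TRL versions.

```python
import torch
from trl.trainer.utils import DPODataCollatorWithPadding  # path may vary by TRL version

class VLMDPOCollator(DPODataCollatorWithPadding):
    # Hypothetical workaround: hide keys the stock collator does not know about,
    # let it pad the text fields, then re-attach the pixel tensors to the batch.
    PIXEL_KEY = "prompt_pixel_values"  # assumed key name, taken from the KeyError above

    def __call__(self, features):
        # Remove the pixel tensors so the parent collator never sees the unknown key.
        pixel_values = [f.pop(self.PIXEL_KEY, None) for f in features]
        batch = super().__call__(features)  # parent pads the keys it recognizes
        if all(v is not None for v in pixel_values):
            batch[self.PIXEL_KEY] = torch.stack(
                [torch.as_tensor(v, dtype=torch.float32) for v in pixel_values]
            )
        return batch
```

You'd pass an instance of this collator to DPOTrainer through its data_collator argument, and then still have to check whether concatenated_inputs and your model's forward signature line up with what it produces in your TRL version.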