AutoTrain ORPO dataset format

Hi,

I'm wondering what the correct format is for the "prompt" field for ORPO on AutoTrain. In the example dataset used (distilabel-capybara-dpo-7k-binarized), the prompts have already been formatted in ChatML. Does that mean I should format my prompts the same way? And if I don't want to use ChatML, should I format the prompt in whatever template I do want?
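To make the question concrete, here is a sketch of the two prompt styles being contrasted. This is just an illustration of what "raw" vs. ChatML-formatted text looks like, not a statement of what AutoTrain requires (that is exactly what's being asked); the example prompt string is made up.

```python
# Raw prompt: just the plain instruction text.
raw_prompt = "What is the capital of France?"

# The same prompt wrapped in ChatML turn markers, as seen in the
# distilabel-capybara-dpo-7k-binarized dataset.
chatml_prompt = (
    "<|im_start|>user\n"
    "What is the capital of France?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

print(raw_prompt)
print(chatml_prompt)
```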

Also, the documentation for ORPO states that the only columns needed for the Reward/ORPO trainer are the text and rejected text columns, which is a bit confusing, because I get an error if I don't supply the "Prompt" field.

Hi. It should be the same as DPO. I'll fix the docs ASAP.
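A DPO-style preference row, as referenced in the reply above, might look like the following sketch. The column names (`prompt`, `chosen`, `rejected`) and the example strings are assumptions based on common DPO datasets; the actual column names expected by AutoTrain should be checked in its column-mapping settings.

```python
import json

# Minimal sketch of one DPO-style preference example:
# a shared prompt plus a preferred and a rejected completion.
example_row = {
    "prompt": "Explain what ORPO training does.",
    "chosen": "ORPO optimizes the model to prefer the chosen response...",
    "rejected": "I don't know.",
}

# Datasets in this shape are often stored as JSON Lines, one row per line.
print(json.dumps(example_row))
```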

Thanks! And re: the prompt format, should it be the raw text, or formatted with ChatML (or another chat template)?