Hi,
I’m wondering about what the correct format is for the “prompt” field for ORPO on autotrain? I see in this example that the dataset (distilabel-capybara-dpo-7k-binarized) used that the prompt has been formatted using chatml format already - does that mean that I should similarly format my prompts? What about if I don’t want to use the chatml format - should I format the prompt accordingly?
Also, the documentation for ORPO states that the columns needed for Reward/ORPO trainer is just the text and rejected text columns which is a bit confusing as I get an error if I try not to supply the “Prompt” field.