Autotrain-advanced LLM finetuning: issues with ORPO/DPO dataset format

Hey all. I’m trying to run autotrain-advanced locally. I’ve formatted my JSONL files of training examples based on the instructions here:

When I follow the setup for the SFT trainer, the suggested format works fine. When I try to use the ORPO or DPO trainers, however (with “rejected_text” and/or “prompt” keys), I run into trouble. I get variations on the following error:

Original column name autotrain_rejected_text not in the dataset. Current columns in the dataset: ['rejected_text', 'chosen']

There are a few things here that strike me as odd. The dataset I’m supplying has text, not chosen - is that getting renamed on the fly by autotrain? Looking at the params passed to the launch command, I see 'text_column': 'autotrain_text', 'rejected_text_column': 'autotrain_rejected_text'. I patterned my config file after the ones in the GitHub repo, so it has explicit column_mapping values for these, but they seem to be ignored no matter what I put in (it apparently defaults to text and rejected_text). I tried fooling it by putting autotrain_text and autotrain_rejected_text directly in the input file… no dice; it tells me those names are reserved.
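For reference, here’s roughly the data section of the config I’m running, patterned on the repo examples (the path and mapping values are mine, so treat this as a sketch of my setup rather than a known-good config):

# data section of my YAML config, modeled on the repo examples;
# these are the column_mapping values that appear to be ignored
data:
  path: ~/projects/phi-3-ft/data-phi-3-ft/
  train_split: train
  valid_split: null
  column_mapping:
    text_column: text
    rejected_text_column: rejected_text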

I’ve tried the data both as CSV and JSONL; neither works. What am I missing? Is this functionality just bugged right now?

(Bonus question: can I explicitly feed it a validation dataset somehow? It seems to ignore everything except the train.jsonl that I put in the data folder.)

If it helps: currently running autotrain-advanced 0.7.107 via WSL2. I also tested this on my MacBook Air and got the same error, so I don’t think it’s directly environment-related. I was able to train a working LoRA using the SFT trainer (with text as the only key in my JSONL dataset), so I know my install isn’t completely janked here. Google returns zero hits for my specific error message, which is honestly impressive.

please show a few lines from your jsonl and the command used for training.

Here’s a toy example of the JSONL format that yields the same behavior:

{"text":"<|user|>I don't know why you say goodbye...<|end|><|assistant|>I say hello!<|end|>","rejected_text":"<|user|>I don't know why you say goodbye...<|end|><|assistant|>I, too, say hello!<|end|>"}
{"text":"<|user|>This is the beginning of...<|end|><|assistant|>a beautiful friendship!<|end|>","rejected_text":"<|user|>This is the beginning of...<|end|><|assistant|>an ugly enmity!<|end|>"}
{"text":"<|user|>Gimme five bees for<|end|><|assistant|>a quarter, they used to say.<|end|>","rejected_text":"<|user|>Gimme five bees for<|end|><|assistant|>the beehive in my yard.<|end|>"}

And here’s an example of an attempt to configure it via command line params that produces the error:

autotrain llm \
--train \
--model microsoft/Phi-3-mini-4k-instruct \
--data-path ~/projects/phi-3-ft/data-phi-3-ft/ \
--lr 1e-4 \
--batch-size 1 \
--epochs 12 \
--trainer orpo \
--peft \
--project-name phi3-ft \
--merge_adapter

(Have redacted my HF token, obviously)

I notice that the example YAML config files for ORPO training also specify a prompt column (see configs/llm_finetuning/llama3-8b-dpo-qlora.yml in the huggingface/autotrain-advanced GitHub repo); is that in error? The dataset formatting page suggests that ORPO only wants text and rejected_text.
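If I’m reading that file right, its mapping block looks something like this (quoting from memory, so double-check against the repo):

column_mapping:
  text_column: chosen
  rejected_text_column: rejected
  prompt_text_column: prompt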

Any thoughts on things I could try here? I’m tempted to dive into the source code and see where these mystery column names are coming from.

The tool breaks at an earlier stage if I use any key other than text for the “chosen” text field. It seems to ignore whatever key I use in place of rejected_text, which… maybe suggests that it’s not actually reading that field the same way it reads text?

please see the docs (“What is AutoTrain Advanced?”). orpo needs: prompt, chosen and rejected.

you also seem to be ignoring column mapping

Thank you for your patience. I’m still struggling with this; is there a way to specify the column mappings via a command-line parameter? I can see from the example config files how to specify them in YAML, but now, when I try to run from a config file, autotrain insists that it can’t find any data or scripts in the data directory (which contains just the train.jsonl file). The web/app interface appears to expect a JSON dict for the mappings, i.e. {"text":"text","rejected_text":"rejected_text"}; I’ll explore that and see if I can get it to work there.
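One thing I’m going to try: the launch params mention text_column and rejected_text_column, so if those are exposed as CLI flags the same way the other params are, something like the following might work. To be clear, these flag spellings are my guess; I haven’t confirmed they exist:

autotrain llm \
--train \
--model microsoft/Phi-3-mini-4k-instruct \
--data-path ~/projects/phi-3-ft/data-phi-3-ft/ \
--trainer orpo \
--text-column text \
--rejected-text-column rejected_text \
--peft \
--project-name phi3-ft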

If ORPO requires a “prompt” field, the data-format documentation needs updating, as it specifically says the reward/ORPO trainer wants “text” and “rejected_text” with no “prompt”; that’s what I based my formatting on originally.
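For what it’s worth, if prompt/chosen/rejected is really what the ORPO trainer wants, I’d guess the JSONL should look something like this (the key names, and whether the chat-template tokens belong here at all, are both assumptions on my part; I’ve just re-split my toy examples):

{"prompt":"I don't know why you say goodbye...","chosen":"I say hello!","rejected":"I, too, say hello!"}
{"prompt":"This is the beginning of...","chosen":"a beautiful friendship!","rejected":"an ugly enmity!"}
{"prompt":"Gimme five bees for","chosen":"a quarter, they used to say.","rejected":"the beehive in my yard."}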