Instruction tuning llm

ron5569 · January 1, 2024, 8:45am

I want to fine-tune an LLM on an instruction dataset consisting of prompt-completion pairs. I have seen many tutorials on fine-tuning LLMs with supervised datasets, and almost all of them use Trainer or SFTTrainer from Hugging Face.

The strange thing that shocked me is that there is no difference between this fine-tuning and the pretraining process; in both cases, the model tries to predict the next token for both the prompt and the completion.

Intuitively, I would prefer to compute the loss (and therefore backpropagate) only on the completion tokens, not on the prompt itself. In other words, I believe next-token prediction should only be trained from the start of the completion. Does that make sense?

Does anyone know of any library that can perform training as I expect?

nielsr · January 1, 2024, 6:15pm

Hi,

That’s supported in the TRL library using the DataCollatorForCompletionOnlyLM class; see the Supervised Fine-tuning Trainer docs.

rangehow · January 3, 2024, 3:33am

What you want probably needs to be handled in the data preprocessing step. To the best of my knowledge, you can manually replace the prompt part of the labels with -100, a special value (the ignore index) that PyTorch skips when computing the loss. Most third-party LLM fine-tuning repos do this, for example llama-recipes, which is officially supported by Llama, or LLaMA-Factory, a well-known fine-tuning framework.
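As a rough illustration of the masking described above, here is a minimal sketch in plain Python. The function name mask_prompt_labels and the token ids are made up for this example; in practice the ids would come from a tokenizer, and the two lists would be the tokenized prompt and completion.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def mask_prompt_labels(prompt_ids, completion_ids):
    """Concatenate prompt and completion, masking the prompt part of the labels.

    Hypothetical helper for illustration; not part of any library.
    """
    input_ids = list(prompt_ids) + list(completion_ids)
    # Prompt positions get -100 so they contribute nothing to the loss;
    # completion positions keep their real token ids.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return {"input_ids": input_ids, "labels": labels}


# Made-up token ids, for illustration only
example = mask_prompt_labels(prompt_ids=[101, 7, 8, 9], completion_ids=[42, 43, 2])
print(example["input_ids"])
print(example["labels"])
```

The model still sees the full prompt as input; only the loss is restricted to the completion tokens.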

cungnlp · January 23, 2024, 4:49pm

Can you help me check how the training data is generated when the “text” column of the dataset is passed to the trainer? I mean, how can I test this with code? Thanks.

nielsr · January 23, 2024, 9:31pm

@cungnlp this can be checked by calling trainer.get_train_dataloader(). You can then inspect some samples from the dataloader:

train_dataloader = trainer.get_train_dataloader()

batch = next(iter(train_dataloader))
print(batch)

cungnlp · January 24, 2024, 3:03am

Thank you for your help!

cungnlp · January 29, 2024, 12:34pm

I checked, but the dataset doesn’t appear to be shifted right by one token. Can you explain why? Also, could you share code showing how to add my own dataset with 3 columns: input_ids, attention_mask, labels?
We look forward to receiving your feedback!

nielsr · May 5, 2024, 8:39am

Hi,

For LLMs in the Transformers library, the labels are typically just a copy of the input_ids (with padding tokens replaced by -100, the ignore index of the cross-entropy loss in PyTorch). The model shifts the labels by one position internally, so the logits at each position are scored against the next token. That is why the batch coming out of the dataloader does not look shifted.
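To make this concrete, here is a small sketch in plain Python of how the three columns could be built and how the internal shift lines up inputs and targets. The helpers build_features and next_token_pairs, the pad id, and the token ids are all hypothetical and only mimic the behaviour described above; they are not the library's actual code.

```python
IGNORE_INDEX = -100  # ignore index of PyTorch's cross-entropy loss
PAD_ID = 0           # hypothetical pad token id, for illustration only


def build_features(token_ids, max_length, pad_id=PAD_ID):
    """Build input_ids, attention_mask and labels for one padded example."""
    ids = list(token_ids)[:max_length]
    n_pad = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * n_pad,
        "attention_mask": [1] * len(ids) + [0] * n_pad,
        # Labels are a copy of input_ids, with padding replaced by -100
        "labels": ids + [IGNORE_INDEX] * n_pad,
    }


def next_token_pairs(features):
    """Mimic the shift: the prediction at position i is scored against labels[i + 1]."""
    inputs = features["input_ids"][:-1]
    targets = features["labels"][1:]
    # Positions whose target is -100 are skipped by the loss
    return [(inp, tgt) for inp, tgt in zip(inputs, targets) if tgt != IGNORE_INDEX]


features = build_features([11, 12, 13], max_length=5)
pairs = next_token_pairs(features)
print(features)
print(pairs)
```

The padding here is masked by position rather than by comparing against the pad id, which avoids accidentally masking real tokens when the pad token is the same as another token such as end-of-sequence.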

cungnlp · May 8, 2024, 6:51am

Thanks for your help, I understand the problem now. Have a productive day.
