QLoRA Llama2 additional special tokens

I am trying to fine-tune the meta-llama/Llama-2-7b-hf model on a recipe dataset using QLoRA and SFTTrainer. My dataset contains special tokens (such as <RECIPE_TITLE>, <END_TITLE>, <STEPS>, <END_STEPS>, etc.) which help with structuring the recipes. Before fine-tuning I added these additional tokens to the tokenizer:

special_tokens_dict = {
    "additional_special_tokens": [
        "<RECIPE_TITLE>", "<END_TITLE>",
        "<INGREDIENTS>", "<END_INGREDIENTS>",
        "<STEPS>", "<END_STEPS>",
    ],
    "pad_token": "",
}
tokenizer.add_special_tokens(special_tokens_dict)

I also resized the model's token embeddings to match the new tokenizer length. The fine-tuned model does place these structural markers in the right positions (the generated recipe is well-structured), but it spells them out as sequences of ordinary sub-word token ids instead of using the newly added token ids.
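For context, the resize step looks roughly like this (the 4-bit quantization settings below are illustrative, not necessarily my exact config):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative QLoRA-style 4-bit load; exact settings may differ.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
# Grow the embedding matrix (and output head) so the new token ids have rows.
model.resize_token_embeddings(len(tokenizer))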

From my knowledge, LoRA does not automatically update the embedding matrix, so I made sure to specify this in the LoRA config:

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj", "k_proj"],
    # Keep full trainable copies of the embeddings and output head
    # so the new token embeddings can be learned and saved.
    modules_to_save=["embed_tokens", "lm_head"],
)
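One way to check that modules_to_save actually makes those copies trainable is to wrap the model with PEFT and inspect the parameters (a sketch, reusing the config above):

from peft import get_peft_model

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()

# The embedding and lm_head copies created by modules_to_save should
# appear here with requires_grad=True.
for name, param in model.named_parameters():
    if "embed_tokens" in name or "lm_head" in name:
        print(name, param.requires_grad, tuple(param.shape))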

What is the reason behind the model not being able to learn the embeddings of the newly added tokens?


Hey, did you manage to solve this?

I put my tokens in a list and use tokenizer.add_tokens(new_tokens) instead, and it works properly.
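Roughly, that looks like this (a sketch; the token names are taken from the question above, not from my actual dataset):

new_tokens = ["<RECIPE_TITLE>", "<END_TITLE>", "<INGREDIENTS>",
              "<END_INGREDIENTS>", "<STEPS>", "<END_STEPS>"]
# add_tokens registers them as regular tokens and returns how many were new.
num_added = tokenizer.add_tokens(new_tokens)
# The embedding matrix still has to be resized to cover the new ids.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocab size is now {len(tokenizer)}")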