QLoRA Llama2 additional special tokens

I am trying to fine-tune the meta-llama/Llama-2-7b-hf model on a recipe dataset using QLoRA and SFTTrainer. My dataset contains special tokens (such as <RECIPE_TITLE>, <END_TITLE>, <INGREDIENTS>, <END_STEPS>, etc.) which help structure the recipes. During fine-tuning I added these additional tokens to the tokenizer:

special_tokens_dict = {'additional_special_tokens': ["<RECIPE_TITLE>", "<END_TITLE>", "<INGREDIENTS>", "<END_INGREDIENTS>", "<STEPS>", "<END_STEPS>"], 'pad_token': "<PAD>"}

I also resized the model's token embeddings so they match the length of the tokenizer. The fine-tuned model does place these structural markers in the right spots (the generated recipe is well-structured), but it spells them out as combinations of ordinary sub-token ids instead of using the newly added token ids.
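For context, here is a torch-only toy of what the embedding resize does conceptually: copy the existing rows and append fresh rows for the new tokens. (The mean-initialization of the new rows shown here is one common choice, not necessarily what resize_token_embeddings does in every Transformers version.)

```python
import torch
import torch.nn as nn

def resize_embedding(old_emb: nn.Embedding, new_num_tokens: int) -> nn.Embedding:
    # Toy stand-in for model.resize_token_embeddings: keep the old rows,
    # add freshly initialized rows for the newly registered tokens.
    new_emb = nn.Embedding(new_num_tokens, old_emb.embedding_dim)
    with torch.no_grad():
        n = min(old_emb.num_embeddings, new_num_tokens)
        new_emb.weight[:n] = old_emb.weight[:n]
        # Initialize the new rows with the mean of the old embeddings so they
        # start roughly in-distribution rather than as random noise.
        new_emb.weight[n:] = old_emb.weight.mean(dim=0, keepdim=True)
    return new_emb

old = nn.Embedding(32000, 16)        # vocab size before adding special tokens
new = resize_embedding(old, 32007)   # 7 new ids (6 structural tokens + pad)
print(new.weight.shape)              # torch.Size([32007, 16])
```

The point is that these new rows start out untrained — they only become useful if something in the training setup actually updates them.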

As far as I know, LoRA does not automatically update the embedding matrix, so I made sure to specify this in the LoRA config:

peft_config = LoraConfig(
    target_modules=["q_proj", "v_proj", "k_proj"],
    modules_to_save=["embed_tokens", "lm_head"],
)
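My mental model of modules_to_save, sketched with plain torch stand-ins rather than actual PEFT internals: PEFT freezes every base parameter, and modules_to_save re-enables a full trainable copy of exactly the listed modules.

```python
import torch.nn as nn

# Minimal stand-in for the base model (not the real Llama module tree).
model = nn.ModuleDict({
    "embed_tokens": nn.Embedding(32007, 16),
    "q_proj": nn.Linear(16, 16),
    "lm_head": nn.Linear(16, 32007),
})

# Conceptually, PEFT first freezes every base parameter...
for p in model.parameters():
    p.requires_grad = False

# ...and modules_to_save=["embed_tokens", "lm_head"] makes full trainable
# copies of exactly those modules (LoRA adapters are injected elsewhere).
for name in ["embed_tokens", "lm_head"]:
    for p in model[name].parameters():
        p.requires_grad = True

trainable = sorted(n for n, p in model.named_parameters() if p.requires_grad)
print(trainable)  # ['embed_tokens.weight', 'lm_head.bias', 'lm_head.weight']
```

Printing the trainable parameter names of the real wrapped model the same way is how I check that the embedding and lm_head copies are actually being trained.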

What is the reason behind the model not being able to learn the embeddings of the newly added tokens?

Hey, did you manage to solve this?

I keep my tokens in a list and use tokenizer.add_tokens(new_tokens) instead, and it works properly.
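For anyone hitting the same issue, here is a stdlib-only toy of what registering a token changes. A greedy longest-match tokenizer (standing in for real BPE) splits an unregistered marker into several pieces; once the full string is in the vocabulary — roughly what add_tokens achieves — it maps to a single id.

```python
# Toy longest-match tokenizer illustrating why an unregistered marker is
# emitted as several pieces while a registered one maps to a single id.
vocab = {"<": 0, "RECIPE": 1, "_": 2, "TITLE": 3, ">": 4}

def tokenize(text, vocab):
    # Greedy longest-match over the vocabulary (stand-in for real BPE merges).
    ids, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry matches position {i}")
    return ids

print(tokenize("<RECIPE_TITLE>", vocab))  # [0, 1, 2, 3, 4] — five pieces
vocab["<RECIPE_TITLE>"] = 5               # what add_tokens effectively does
print(tokenize("<RECIPE_TITLE>", vocab))  # [5] — a single id
```

The model can only learn an embedding for the marker once the tokenizer emits that single id during training.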