Merging successive LoRAs into a base model

I have multiple LoRAs that I've trained on a base model, and I've been attempting to merge them into the base model successively, but there appears to be some issue with the resulting model. Specifically, I believe the problem is with the tokenizer, though I'm not sure how to fix it, since my understanding is that the tokenizer should simply be the same as the base model's.

When I load the base model into text-generation-webui and apply all three LoRAs, I get good generations with the expected behavior. However, after merging them together with the code below, I get gibberish that very clearly looks like incorrect tokens being applied.


from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the quantized base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Mistral-Small-Instruct-2409-bnb-4bit"
)
tokenizer = AutoTokenizer.from_pretrained(
    "unsloth/Mistral-Small-Instruct-2409-bnb-4bit"
)

# Merge the first LoRA into the base model and save
model = PeftModel.from_pretrained(model, "hobie-lora4")
model = model.merge_and_unload()
model.save_pretrained("HobieLLMPart4")
tokenizer.save_pretrained("HobieLLMPart4")

# Merge the second LoRA into the already-merged model and save
model = PeftModel.from_pretrained(model, "hobie-lora5")
model = model.merge_and_unload()
model.save_pretrained("HobieLLMPart5")
tokenizer.save_pretrained("HobieLLMPart5")

# Merge the third LoRA into the result and save
model = PeftModel.from_pretrained(model, "hobie-lora6")
model = model.merge_and_unload()
model.save_pretrained("HobieLLMPart6")
tokenizer.save_pretrained("HobieLLMPart6")

I’ve never tried this myself, so this may not be entirely correct, but it seems to me that while the first merge (with hobie-lora4) would work, the subsequent merges would not.

PeftModel.from_pretrained(model, directory) may not behave as expected because of type differences between a model initialized with AutoModelForCausalLM.from_pretrained() and the model returned by model.merge_and_unload().
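If you do want to chain the merges, one way to keep the types consistent might be to reload each saved merge from disk so that every PeftModel.from_pretrained call starts from a freshly initialized transformers model. A rough, untested sketch, reusing the directory names from the question:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload the first merged checkpoint from disk as a plain transformers model
model = AutoModelForCausalLM.from_pretrained("HobieLLMPart4")

# Apply and merge the next adapter on top of the reloaded checkpoint
model = PeftModel.from_pretrained(model, "hobie-lora5")
model = model.merge_and_unload()
model.save_pretrained("HobieLLMPart5")

Whether the merged model then behaves as intended is a separate question, as the next reply explains.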


If the different LoRA adapters were all trained on the base model, it would not be surprising that you cannot just merge all of them together. After you merge the first adapter, the new base model (i.e. old base model + first LoRA) is different from what the 2nd adapter was trained on, so it doesn’t work as expected.

So either you train the 2nd LoRA on top of the base model merged with the 1st LoRA (and so on), or you try the add_weighted_adapter method, which combines multiple LoRA adapters into a single one: LoRA.
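For reference, a minimal sketch of the add_weighted_adapter approach (untested; the adapter names, weights, the "cat" combination type, and the output directory are just illustrative assumptions, and other combination types such as "ties" or "svd" are also available):

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "unsloth/Mistral-Small-Instruct-2409-bnb-4bit"
)

# Load all three adapters onto the same base model under distinct names
model = PeftModel.from_pretrained(base, "hobie-lora4", adapter_name="lora4")
model.load_adapter("hobie-lora5", adapter_name="lora5")
model.load_adapter("hobie-lora6", adapter_name="lora6")

# Combine them into a single adapter; weights and combination_type are illustrative
model.add_weighted_adapter(
    adapters=["lora4", "lora5", "lora6"],
    weights=[1.0, 1.0, 1.0],
    adapter_name="combined",
    combination_type="cat",
)
model.set_adapter("combined")

# Save the adapters (including the combined one); the combined adapter can then
# be applied or merged into the base model like any other LoRA
model.save_pretrained("hobie-lora-combined")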
