Having trouble loading a fine-tuned PEFT model (CodeLlama-13b-Instruct-hf base)

I fine-tuned CodeLlama using PEFT, and I added some custom tokens as well as a special padding token. So instead of the original vocab size of 32016, the adapter was trained with a slightly larger vocab of 32023. It seemed to work correctly after training. However, when I save it (`trainer.model.save_pretrained(...)`) and reload it (`AutoPeftModelForCausalLM.from_pretrained(...)`), I get this error:

```
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM: size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight: copying a param with shape torch.Size([32023, 5120]) from checkpoint, the shape in current model is torch.Size([32016, 5120]).
```
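For reference, the save and reload steps look roughly like this (the adapter path is just a placeholder; `trainer` and `tokenizer` come from my training script):

```python
from peft import AutoPeftModelForCausalLM

# After training finishes, save the adapter (path is a placeholder)
trainer.model.save_pretrained("codellama-custom-tokens")
tokenizer.save_pretrained("codellama-custom-tokens")

# Later, in a fresh process -- this line raises the size mismatch
model = AutoPeftModelForCausalLM.from_pretrained("codellama-custom-tokens")
```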

Any way to fix this?


hey @thenatefisher 🙂
In order to load a model where you have changed the token embeddings (and lm head), you need to:

  1. Add the embeddings as a LoRA layer so they get fine-tuned as well
  2. Add the embeddings to the modules_to_save list in the LoRA config (see the config sketch after this list)
  3. Make sure the adapters are saved as checkpoints
  4. Load the base model again and attach the adapter to it; note that model_id is the saved adapter checkpoint
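Here is a minimal sketch of what steps 1–2 can look like in the LoraConfig; the rank, alpha, and target_modules values are just illustrative assumptions, so adjust them for your setup:

```python
from peft import LoraConfig

# Minimal sketch -- r / alpha / target_modules are illustrative, not prescriptive
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    # Train and save full copies of the resized embedding and lm head
    # so the extra token rows survive the save/load round trip
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)
```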

hope this helps!

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Free memory before reloading the base model
del model
del trainer
torch.cuda.empty_cache()

# Load the base model and resize its embedding matrix to the new tokenizer length
model = AutoModelForCausalLM.from_pretrained(**model_params)
model.resize_token_embeddings(len(tokenizer))

# Attach the saved adapter to the resized base model
# (model_id is the saved adapter checkpoint)
model = PeftModel.from_pretrained(model=model, model_id="sft_llm/checkpoint-30")
```
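If you want a single standalone checkpoint (base weights with the adapter folded in), you can optionally follow up with merge_and_unload(); the save path below is just a placeholder:

```python
# Optional: fold the adapter into the base weights and save a standalone model
merged_model = model.merge_and_unload()
merged_model.save_pretrained("sft_llm/merged")
tokenizer.save_pretrained("sft_llm/merged")
```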
```python
from trl import setup_chat_format
_, tokenizer = setup_chat_format(base_model, tokenizer)
```

Check `len(tokenizer)` before and after the above code block and see the difference.
You can then resize the base model's token embeddings to match the new tokenizer:

```python
base_model.resize_token_embeddings(len(tokenizer))
```
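For example, a quick sanity check could look like this (the printed sizes are illustrative, not exact):

```python
from trl import setup_chat_format

print(len(tokenizer))                      # vocab size before, e.g. 32016
_, tokenizer = setup_chat_format(base_model, tokenizer)
print(len(tokenizer))                      # larger once the chat special tokens are added

# Resize the embedding matrix to the new vocab size before loading the adapter
base_model.resize_token_embeddings(len(tokenizer))
```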