I fine-tuned CodeLlama using PEFT, but I added some custom tokens and a special padding token. So instead of the original vocab size of 32016, the adapter was trained with a slightly larger vocab of 32023. It seemed to work correctly after training. However, when I save it with `trainer.model.save_pretrained(...)` and reload it with `AutoPeftModelForCausalLM.from_pretrained(...)`, I get this error:
```
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
    size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight:
    copying a param with shape torch.Size([32023, 5120]) from checkpoint,
    the shape in current model is torch.Size([32016, 5120]).
```
hey @thenatefisher
In order to load a model where you have changed the token embeddings and LM head, you need to:

1. Add the embeddings as a LoRA layer so they are fine-tuned as well
2. Add the embeddings to the `modules_to_save` list in the LoRA config
3. Make sure the adapters are saved as checkpoints
4. Load the base model again and add the adapter to it; note that `model_id` is the saved adapter checkpoint
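Steps 1–3 might look like this in the LoRA config. This is a sketch, not the exact config from the original training run: the `r`/`lora_alpha` values and the `target_modules` list are assumptions you should adapt to your model.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                       # assumed rank, use your own
    lora_alpha=32,
    task_type="CAUSAL_LM",
    # attention projections get LoRA adapters (typical for Llama-family models)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    # full copies of the resized embeddings and LM head are trained and
    # saved with the adapter, so the 32023-row matrices end up in the checkpoint
    modules_to_save=["embed_tokens", "lm_head"],
)
```

With `modules_to_save` set, the adapter checkpoint contains the full resized embedding and head weights, which is exactly why the base model must be resized before the adapter is loaded back.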
hope this helps!
```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Free memory before loading the base model and merging adapter weights
del model
del trainer
torch.cuda.empty_cache()

# Load the base model and resize its embeddings to match the tokenizer
model = AutoModelForCausalLM.from_pretrained(**model_params)
model.resize_token_embeddings(len(tokenizer))

# Attach the saved adapter to the resized base model
model = PeftModel.from_pretrained(model=model, model_id="sft_llm/checkpoint-30")
```
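The shape error from the question can be reproduced with a plain `nn.Embedding` standing in for the model (toy sizes replace the real 32016/32023 × 5120). Resizing before `load_state_dict` — which is essentially what `resize_token_embeddings` does for the whole model — makes the checkpoint fit:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real sizes (32016 -> 32023, hidden size 5120)
OLD_VOCAB, NEW_VOCAB, DIM = 16, 23, 4

# The checkpoint was saved with the extended vocab...
ckpt = nn.Embedding(NEW_VOCAB, DIM).state_dict()

# ...but a freshly loaded base model still has the original vocab
base = nn.Embedding(OLD_VOCAB, DIM)
try:
    base.load_state_dict(ckpt)  # 23 rows into 16 -> size mismatch
    mismatch = False
except RuntimeError:
    mismatch = True             # same class of error as in the question

# Fix: grow the embedding first (resize_token_embeddings keeps the old
# rows and freshly initializes the new ones)
resized = nn.Embedding(NEW_VOCAB, DIM)
with torch.no_grad():
    resized.weight[:OLD_VOCAB] = base.weight
resized.load_state_dict(ckpt)   # now the shapes line up

print(mismatch, tuple(resized.weight.shape))  # True (23, 4)
```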
```python
from trl import setup_chat_format
_, tokenizer = setup_chat_format(base_model, tokenizer)
```
Check `len(tokenizer)` before and after the above code block and see the difference. You can then resize the base model's token embeddings using the new tokenizer.
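As a toy illustration of why the count changes: adding special tokens appends new entries to the vocab, so the embedding matrix needs that many rows afterwards. The token names below are the usual ChatML specials, used here as an assumption, not necessarily the exact set `setup_chat_format` adds.

```python
# Stand-in vocab: maps token string -> id, like a tokenizer's vocab
vocab = {f"tok{i}": i for i in range(16)}

# Chat-template helpers add special tokens; each new one gets the next id
special_tokens = ["<|im_start|>", "<|im_end|>", "<pad>"]
added = 0
for tok in special_tokens:
    if tok not in vocab:
        vocab[tok] = len(vocab)
        added += 1

# The embedding matrix must now have len(vocab) rows, hence the call to
# model.resize_token_embeddings(len(tokenizer)) after the tokens are added
print(len(vocab), added)  # 19 3
```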