Hi,
I am trying to add some extra tokens to the tokenizer of the peft model.
If I resize the base model first:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel
config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.add_special_tokens({'pad_token': ''})
base_model.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(base_model, model_name)
I get a size mismatch error when loading the adapter, because the resized base model embeddings no longer match the embedding weights stored in the adapter checkpoint.
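For what it's worth, the resize itself succeeds; the failure only shows up when the adapter is loaded. A quick sanity check (just a diagnostic, not a fix):
# After the resize, the base model has one extra embedding row
print(base_model.get_input_embeddings().num_embeddings)  # old vocab size + 1
print(len(tokenizer))                                     # same value
# The adapter checkpoint at model_name was saved with the old vocab size,
# so PeftModel.from_pretrained(base_model, model_name) then raises the size mismatch.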
If I load the original PeftModel like this and resize:
from peft import AutoPeftModelForCausalLM
model = AutoPeftModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.add_special_tokens({'pad_token': ''})
model.resize_token_embeddings(len(tokenizer))
This fails with a TypeError: Old embeddings are of type <class 'peft.utils.other.ModulesToSaveWrapper'>, which is not an instance of <class 'torch.nn.modules.sparse.Embedding'>. You should either use a different resize function or make sure that old_embeddings are an instance of <class 'torch.nn.modules.sparse.Embedding'>.
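From the error, it looks like the input embedding of the PEFT model is wrapped by PEFT rather than being a plain nn.Embedding. A minimal way to inspect that (the original_module and modules_to_save attribute names are my assumption from reading the PEFT source, not something I have verified on this model):
wrapped = model.get_input_embeddings()
print(type(wrapped))                          # peft.utils.other.ModulesToSaveWrapper
print(type(wrapped.original_module))          # assumption: the underlying nn.Embedding
print(list(wrapped.modules_to_save.keys()))   # assumption: trainable copies per adapter, e.g. ['default']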
What is the right approach here?
Thanks
Mohan
One can manually adjust the input and output layers for the resize like this:
import torch.nn as nn
# num_new_tokens = number of tokens added to the tokenizer
out = model.get_output_embeddings()  # usually the lm_head, an nn.Linear
old_tokens, dim = out.weight.size()
# Build a larger output layer and copy over the existing rows
# (assuming the lm_head has no bias, as is typical for causal LMs)
new_output = nn.Linear(dim, old_tokens + num_new_tokens, bias=False)
new_output.to(out.weight.device, dtype=out.weight.dtype)
new_output.weight.data[:old_tokens, :] = out.weight.data[:old_tokens, :]
model.set_output_embeddings(new_output)
We can do the same for get_input_embeddings()/set_input_embeddings() so that the num_new_tokens added tokens are accounted for on the input side as well; a sketch of that is below. I have not tested this yet. Is this the right approach?
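For completeness, the input-side counterpart would look something like this (untested; num_new_tokens is the count returned by tokenizer.add_special_tokens, and if get_input_embeddings() returns a ModulesToSaveWrapper the wrapped module would have to be targeted instead):
import torch.nn as nn
# Input side: the token embedding is an nn.Embedding
inp = model.get_input_embeddings()
old_tokens, dim = inp.weight.size()
new_input = nn.Embedding(old_tokens + num_new_tokens, dim)
new_input.to(inp.weight.device, dtype=inp.weight.dtype)
# Copy the existing rows; the new rows keep their random initialization
new_input.weight.data[:old_tokens, :] = inp.weight.data[:old_tokens, :]
model.set_input_embeddings(new_input)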
Thanks
Mohan
Hi Mohan,
It seems like you’re encountering issues with resizing the token embeddings of a PEFT model after adding extra tokens to the tokenizer. Here’s a potential approach to resolve the error you’re facing:
- Resizing for PEFT Models:
The problem arises because PEFT wraps some of the model's modules (hence the ModulesToSaveWrapper in your traceback), so resize_token_embeddings isn't directly compatible with the wrapped layers the way it is with a plain embedding in a standard model.
Instead of trying to modify the embedding size of the PEFT model directly, a better solution might be to make sure the base model's embedding size already matches the extended tokenizer before the PEFT wrapper is applied.
- Steps to Fix the Issue:
- First, extend the tokenizer and resize the base model's embeddings before applying PEFT.
- Then, initialize the PEFT model on top of the resized base model.
Here’s an updated approach:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

# Load the base model and tokenizer
config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Add special tokens to the tokenizer
tokenizer.add_special_tokens({'pad_token': ''})

# Resize the token embeddings of the base model to match the extended tokenizer
base_model.resize_token_embeddings(len(tokenizer))

# Now load the PEFT adapter on top of the resized base model
model = PeftModel.from_pretrained(base_model, model_name)

# After initializing the PEFT model, resize the token embeddings again (if needed)
model.resize_token_embeddings(len(tokenizer))
- resize_token_embeddings is applied on the base model first so that it matches the tokenizer size.
- Then the PEFT model is initialized from the resized base model.
- If you face further issues, you may need to explore specific methods within the PEFT library (for example modules_to_save, sketched below) or consult the official PEFT documentation for proper handling of token resizing.
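For instance, if you are training the adapter yourself, one common pattern is to list the embedding and output layers under modules_to_save in the LoraConfig, so that the resized layers are stored together with the adapter and the sizes stay consistent at load time. A rough sketch (the module names embed_tokens/lm_head, the LoRA hyperparameters, and the '<pad>' token value are placeholders that depend on your model):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Extend the tokenizer and resize the base model BEFORE wrapping it with PEFT
tokenizer.add_special_tokens({'pad_token': '<pad>'})
base_model.resize_token_embeddings(len(tokenizer))

# Keep full copies of the resized embedding and output layers in the adapter checkpoint.
# 'embed_tokens'/'lm_head' are typical for Llama-style models; adjust for your architecture.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=['q_proj', 'v_proj'],
    modules_to_save=['embed_tokens', 'lm_head'],
    task_type='CAUSAL_LM',
)
model = get_peft_model(base_model, lora_config)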
Let me know if you still run into problems or need more clarification!
Good luck!
Hi,
If you look at my code snippet, I did exactly that. In this line:
model = PeftModel.from_pretrained(base_model, model_name)
base_model now has a different embedding size than the checkpoint at model_name (it differs by 1 because of the added pad token), so loading fails at this step.
That's why I explicitly modified the input and output embeddings instead. I will search more. Let me know if my approach would work.
Thanks
Mohan