Hi,
I am trying to add some extra tokens to the tokenizer of the peft model.
If I resize the base model first:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel
config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.add_special_tokens({'pad_token': ''})
base_model.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(base_model, model_name)
I get a size mismatch error when loading the adapter, because the resized base model embeddings no longer match the embedding weights stored in the adapter checkpoint.
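For what it's worth, the resize itself succeeds; the failure only shows up when the adapter is loaded. A quick sanity check (just a diagnostic, not a fix):
# After the resize, the base model has one extra embedding row
print(base_model.get_input_embeddings().num_embeddings)  # old vocab size + 1
print(len(tokenizer))                                     # same value
# The adapter checkpoint at model_name was saved with the old vocab size,
# so PeftModel.from_pretrained(base_model, model_name) then raises the size mismatch.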
If I load the original PeftModel like this and resize:
from peft import AutoPeftModelForCausalLM
model = AutoPeftModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.add_special_tokens({'pad_token': ''})
model.resize_token_embeddings(len(tokenizer))
This fails with a TypeError: Old embeddings are of type <class 'peft.utils.other.ModulesToSaveWrapper'>, which is not an instance of <class 'torch.nn.modules.sparse.Embedding'>. You should either use a different resize function or make sure that old_embeddings are an instance of <class 'torch.nn.modules.sparse.Embedding'>.
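From the error, it looks like the input embedding of the PEFT model is wrapped by PEFT rather than being a plain nn.Embedding. A minimal way to inspect that (the original_module and modules_to_save attribute names are my assumption from reading the PEFT source, not something I have verified on this model):
wrapped = model.get_input_embeddings()
print(type(wrapped))                          # peft.utils.other.ModulesToSaveWrapper
print(type(wrapped.original_module))          # assumption: the underlying nn.Embedding
print(list(wrapped.modules_to_save.keys()))   # assumption: trainable copies per adapter, e.g. ['default']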
What is the right approach here?
Thanks
Mohan
One can manually adjust the input and output layers for the resize like this:
import torch.nn as nn
# num_new_tokens = number of tokens added to the tokenizer
out = model.get_output_embeddings()  # usually the lm_head, an nn.Linear
old_tokens, dim = out.weight.size()
# Build a larger output layer and copy over the existing rows
# (assuming the lm_head has no bias, as is typical for causal LMs)
new_output = nn.Linear(dim, old_tokens + num_new_tokens, bias=False)
new_output.to(out.weight.device, dtype=out.weight.dtype)
new_output.weight.data[:old_tokens, :] = out.weight.data[:old_tokens, :]
model.set_output_embeddings(new_output)
We can do the same for get_input_embeddings()/set_input_embeddings() so that the num_new_tokens added tokens are accounted for on the input side as well; a sketch of that is below. I have not tested this yet. Is this the right approach?
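For completeness, the input-side counterpart would look something like this (untested; num_new_tokens is the count returned by tokenizer.add_special_tokens, and if get_input_embeddings() returns a ModulesToSaveWrapper the wrapped module would have to be targeted instead):
import torch.nn as nn
# Input side: the token embedding is an nn.Embedding
inp = model.get_input_embeddings()
old_tokens, dim = inp.weight.size()
new_input = nn.Embedding(old_tokens + num_new_tokens, dim)
new_input.to(inp.weight.device, dtype=inp.weight.dtype)
# Copy the existing rows; the new rows keep their random initialization
new_input.weight.data[:old_tokens, :] = inp.weight.data[:old_tokens, :]
model.set_input_embeddings(new_input)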
Thanks
Mohan
Hi Mohan,
It seems like you’re encountering issues with resizing the token embeddings of a PEFT model after adding extra tokens to the tokenizer. Here’s a potential approach to resolve the error you’re facing:
- Resizing for PEFT Models:
The problem arises because PEFT wraps some of the model's modules (hence the ModulesToSaveWrapper in your traceback), so resize_token_embeddings isn't directly compatible with the wrapped layers the way it is with a plain embedding in a standard model.
Instead of trying to modify the embedding size of the PEFT model directly, a better solution might be to make sure the base model's embedding size already matches the extended tokenizer before the PEFT wrapper is applied.
- Steps to Fix the Issue:
- First, extend the tokenizer and resize the base model's embeddings before applying PEFT.
- Then, initialize the PEFT model on top of the resized base model.
Here’s an updated approach:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

# Load the base model and tokenizer
config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Add special tokens to the tokenizer
tokenizer.add_special_tokens({'pad_token': ''})

# Resize the token embeddings of the base model to match the extended tokenizer
base_model.resize_token_embeddings(len(tokenizer))

# Now load the PEFT adapter on top of the resized base model
model = PeftModel.from_pretrained(base_model, model_name)

# After initializing the PEFT model, resize the token embeddings again (if needed)
model.resize_token_embeddings(len(tokenizer))
- resize_token_embeddings is applied on the base model first so that it matches the tokenizer size.
- Then the PEFT model is initialized from the resized base model.
- If you face further issues, you may need to explore specific methods within the PEFT library (for example modules_to_save, sketched below) or consult the official PEFT documentation for proper handling of token resizing.
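For instance, if you are training the adapter yourself, one common pattern is to list the embedding and output layers under modules_to_save in the LoraConfig, so that the resized layers are stored together with the adapter and the sizes stay consistent at load time. A rough sketch (the module names embed_tokens/lm_head, the LoRA hyperparameters, and the '<pad>' token value are placeholders that depend on your model):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Extend the tokenizer and resize the base model BEFORE wrapping it with PEFT
tokenizer.add_special_tokens({'pad_token': '<pad>'})
base_model.resize_token_embeddings(len(tokenizer))

# Keep full copies of the resized embedding and output layers in the adapter checkpoint.
# 'embed_tokens'/'lm_head' are typical for Llama-style models; adjust for your architecture.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=['q_proj', 'v_proj'],
    modules_to_save=['embed_tokens', 'lm_head'],
    task_type='CAUSAL_LM',
)
model = get_peft_model(base_model, lora_config)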
Let me know if you still run into problems or need more clarification!
Good luck!
Hi,
If you look at my code snippet, I did exactly that. In this line:
model = PeftModel.from_pretrained(base_model, model_name)
base_model now has a different embedding size than the checkpoint at model_name (it differs by 1 because of the added pad token), so loading fails at this step.
That's why I explicitly modified the input and output embeddings instead. I will search more. Let me know if my approach would work.
Thanks
Mohan