Hi
I have added new tokens to my tokenizer and would like to freeze the weights of the original tokens in the input embedding layer while allowing the weights of the new tokens to be trained. This is what I've tried:
```python
existing_vocab = tokenizer.get_vocab()
for token_id in existing_vocab.values():
    if token_id < tokenizer.vocab_size - len(new_tokens):
        embedding = model.get_input_embeddings().weight[token_id]
        embedding = embedding.detach()
        embedding.requires_grad = False
```
But when I subsequently check, the weights of the original tokens are not frozen.
Any ideas how I can work around this?
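For what it's worth: indexing `weight[token_id]` and calling `.detach()` produces a new tensor, so setting `requires_grad = False` on it never touches the actual embedding parameter, and `requires_grad` can only be toggled for a whole parameter, not for individual rows. One possible workaround is to leave the embedding matrix trainable and instead mask the gradients of the original rows with a tensor hook. Below is a minimal sketch, assuming the `tokenizer`, `model`, and `new_tokens` names from the snippet above; the hook function name is just illustrative.

```python
import torch

# Rows below this index belong to the original vocabulary and should stay fixed;
# the rows appended for the new tokens remain trainable.
num_original_tokens = len(tokenizer) - len(new_tokens)

embedding_weight = model.get_input_embeddings().weight

def zero_grad_for_original_rows(grad):
    # Zero the gradient of the original rows so the optimizer never updates them.
    grad = grad.clone()
    grad[:num_original_tokens] = 0
    return grad

embedding_weight.register_hook(zero_grad_for_original_rows)
```

Two caveats: `tokenizer.vocab_size` typically does not count added tokens, so `len(tokenizer) - len(new_tokens)` is usually the safer boundary; and the hook only masks gradients, so an optimizer with weight decay or momentum (e.g. AdamW) can still nudge the "frozen" rows slightly. If that matters, one can also copy the original rows back from a saved snapshot after each `optimizer.step()`.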