Can we add extra word embedding to the BERT?

rohanaiml · February 21, 2022, 1:28am

Hi Everyone,

I am trying to add a tf-idf-weighted word2vec embedding to the BERT input as an experiment. The goal is to generate text with bert_to_bert, or with a BERT encoder paired with another transformer decoder.

Thanks,
Rohan

beneyal · February 21, 2022, 2:19am

Hello!

I’m not sure I 100% understand what you’re trying to do.

If what you want is to add another embedding matrix to the existing word embedding matrix, you can do this:

import torch

from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
with torch.no_grad():
    model.embeddings.word_embeddings.add_(word2vec_matrix)

If you want another embedding layer, then it’s a bit more complicated. You will need to copy the BertEmbeddings layer code from the transformers source and add the new layer there. Once you’ve done that, you can swap in the whole module and then restore the pre-trained weights:

from transformers import AutoConfig

# Save the pre-trained weights before replacing the module
with torch.no_grad():
    word_embeddings = torch.clone(model.embeddings.word_embeddings.weight)
    pos_embeddings = torch.clone(model.embeddings.position_embeddings.weight)
    token_type_embeddings = torch.clone(model.embeddings.token_type_embeddings.weight)

# AugmentedBertEmbeddings is your modified copy of BertEmbeddings
config = AutoConfig.from_pretrained("bert-base-uncased")
model.embeddings = AugmentedBertEmbeddings(config)

# Restore the pre-trained weights in the new module
with torch.no_grad():
    model.embeddings.word_embeddings.weight.set_(word_embeddings)
    model.embeddings.position_embeddings.weight.set_(pos_embeddings)
    model.embeddings.token_type_embeddings.weight.set_(token_type_embeddings)
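AugmentedBertEmbeddings is not defined anywhere in this thread, so here is a minimal sketch of what such a module could look like. It is a simplified stand-in written in plain PyTorch, not the library's BertEmbeddings code. The extra_embeddings and extra_projection layers, the extra_dim argument, and the SimpleNamespace config are all assumptions made for illustration. Only the config attribute names follow BertConfig.

```python
import torch
from torch import nn
from types import SimpleNamespace


class AugmentedBertEmbeddings(nn.Module):
    """Simplified BERT-style embeddings plus one extra lookup table (hypothetical)."""

    def __init__(self, config, extra_dim=300):
        super().__init__()
        self.word_embeddings = nn.Embedding(config.vocab_size, config.hidden_size)
        self.position_embeddings = nn.Embedding(
            config.max_position_embeddings, config.hidden_size
        )
        self.token_type_embeddings = nn.Embedding(
            config.type_vocab_size, config.hidden_size
        )
        # New layer: one extra vector per token id, projected to hidden_size
        # so it can be summed with the other embeddings.
        self.extra_embeddings = nn.Embedding(config.vocab_size, extra_dim)
        self.extra_projection = nn.Linear(extra_dim, config.hidden_size, bias=False)
        self.LayerNorm = nn.LayerNorm(config.hidden_size, eps=config.layer_norm_eps)
        self.dropout = nn.Dropout(config.hidden_dropout_prob)

    def forward(self, input_ids, token_type_ids=None):
        seq_len = input_ids.shape[1]
        if token_type_ids is None:
            token_type_ids = torch.zeros_like(input_ids)
        position_ids = torch.arange(seq_len, device=input_ids.device).unsqueeze(0)
        summed = (
            self.word_embeddings(input_ids)
            + self.position_embeddings(position_ids)
            + self.token_type_embeddings(token_type_ids)
            + self.extra_projection(self.extra_embeddings(input_ids))
        )
        return self.dropout(self.LayerNorm(summed))


# Tiny stand-in config so the sketch runs without downloading anything.
config = SimpleNamespace(
    vocab_size=100, hidden_size=16, max_position_embeddings=32,
    type_vocab_size=2, layer_norm_eps=1e-12, hidden_dropout_prob=0.1,
)
emb = AugmentedBertEmbeddings(config, extra_dim=8)
out = emb(torch.randint(0, 100, (2, 5)))  # shape: (batch, seq_len, hidden_size)
```

Two caveats. The forward method of the library's BertEmbeddings takes more arguments than this sketch, and they differ between transformers versions, so a module swapped into BertModel would need to match the installed version's signature. The embeddings LayerNorm also carries pre-trained weights, so it would need to be copied over in the same way as the three embedding tables.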

I hope I haven’t made any mistakes, and that I managed to help.

rohanaiml · February 26, 2022, 3:06am

Thanks @beneyal. This seems complicated, as I may need to re-train everything. Can we instead combine the BERT embedding with my own embedding before passing it to the decoder?

beneyal · March 2, 2022, 12:34pm

BERT doesn’t have a decoder, so I’m not sure what you’re referring to.
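If the question refers to an encoder-decoder setup such as the bert_to_bert mentioned in the first post, one possible reading is to leave BERT untouched and fuse its output with the external vectors afterwards. The sketch below is only an illustration under assumptions: FusedEncoderOutput is a hypothetical module, the random tensors stand in for the encoder's last_hidden_state and for word2vec vectors, and the external vectors are assumed to be already aligned token by token with the WordPiece sequence.

```python
import torch
from torch import nn


class FusedEncoderOutput(nn.Module):
    """Concatenate encoder states with external vectors and project back (hypothetical)."""

    def __init__(self, bert_dim, extra_dim):
        super().__init__()
        self.project = nn.Linear(bert_dim + extra_dim, bert_dim)

    def forward(self, bert_hidden, extra_vectors):
        # Both inputs: (batch, seq_len, dim), one vector per token position.
        return self.project(torch.cat([bert_hidden, extra_vectors], dim=-1))


bert_hidden = torch.randn(2, 5, 768)    # stand-in for BertModel(...).last_hidden_state
extra_vectors = torch.randn(2, 5, 300)  # stand-in for aligned word2vec vectors
fused = FusedEncoderOutput(768, 300)(bert_hidden, extra_vectors)
```

With this layout only the projection layer starts from random weights, and the fused tensor keeps the encoder's hidden size, so it could take the place of the encoder output that a decoder attends over.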

hozer · August 4, 2022, 8:44pm

Hi @beneyal, I am planning to pre-train BERT with an extra input embedding (e.g. one input from a WordPiece tokenizer and one from BPE). Can I do it with the same method you explained? Actually, I am planning to add some syntactic information in the second embedding. Does that make sense?

mobassir · August 4, 2022, 10:19pm

Hello,
I think my question here, "How to concat laserembeddings with huggingface funnel transformers simple CLS output for fine tuning on downstream NLP sequence classification data problem?", is similar.
Can you please help? Thanks in advance.
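For reference, concatenating a sentence-level vector with the CLS output for classification could be sketched roughly as below. This is an assumption about what the linked question asks, not code from it: ConcatClassifier is a hypothetical head, the dimensions are placeholders, and the random tensors stand in for the transformer's CLS vector and for precomputed sentence embeddings.

```python
import torch
from torch import nn


class ConcatClassifier(nn.Module):
    """Classify from [CLS vector ; external sentence vector] (hypothetical head)."""

    def __init__(self, cls_dim, sent_dim, num_labels):
        super().__init__()
        self.classifier = nn.Linear(cls_dim + sent_dim, num_labels)

    def forward(self, cls_vector, sentence_vector):
        # Both inputs: (batch, dim), one vector per example.
        return self.classifier(torch.cat([cls_vector, sentence_vector], dim=-1))


cls_vector = torch.randn(4, 768)        # stand-in for the model's CLS output
sentence_vector = torch.randn(4, 1024)  # stand-in for precomputed sentence embeddings
logits = ConcatClassifier(768, 1024, num_labels=2)(cls_vector, sentence_vector)
```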

mobassir · August 15, 2022, 6:12pm

Your code doesn’t work.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_17/1823014925.py in <module>
      8 model = BertModel.from_pretrained("bert-base-uncased")
      9 with torch.no_grad():
---> 10     model.embeddings.word_embeddings.add_(fold0_laser)
     11 # model.deberta.embeddings.word_embeddings

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1184                 return modules[name]
   1185         raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1186             type(self).__name__, name))
   1187 
   1188     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'Embedding' object has no attribute 'add_'
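The traceback points at the cause: add_ is a tensor method, and word_embeddings is an nn.Embedding module, so the in-place addition has to go through its .weight tensor. A minimal sketch, using a small nn.Embedding as a stand-in for model.embeddings.word_embeddings and a random extra_matrix as a placeholder for the vectors being added (both are assumptions for illustration):

```python
import torch
from torch import nn

# Stand-in for model.embeddings.word_embeddings, kept small for the example.
vocab_size, hidden_size = 100, 16
word_embeddings = nn.Embedding(vocab_size, hidden_size)

# Hypothetical extra matrix, assumed to be already aligned to the same
# vocabulary and to have the same hidden size as the embedding table.
extra_matrix = torch.randn(vocab_size, hidden_size)

before = word_embeddings.weight.detach().clone()
with torch.no_grad():
    # add_ lives on the weight tensor, not on the nn.Embedding module.
    word_embeddings.weight.add_(extra_matrix)
```

Note that the addition only works if the extra matrix has exactly the shape of the embedding table (vocabulary size by hidden size), so vectors with a different dimension or a different vocabulary would need to be mapped and projected first.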
