How to add an additional module to the BERT architecture, then load the original weights and use it

I aim to:

  1. Add an additional module to the BERT architecture. In detail, each layer should take the cross product of its hidden states with a similar but not identical vector, where the vector depends on the layer index and the type of BERT model.
  2. Load BERT's pretrained weights into this new BERT model.
  3. Use the new model directly, or continue training it.

I’m very confused about how to do this, since we usually load a model's configuration and weights directly from Hugging Face. The modification could presumably go into modeling_bert.py (after the hidden_states of each layer), but can the vector then be different depending on the model type?
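Here is a minimal sketch of what I have in mind so far. The names `LayerVectorWrapper` and `add_layer_vectors` are my own, and I use an element-wise product with a learned per-layer vector as a placeholder for the actual per-layer operation:

```python
import torch
import torch.nn as nn
from transformers import BertModel

class LayerVectorWrapper(nn.Module):
    """Wraps a BertLayer and combines its output with a per-layer vector."""

    def __init__(self, layer, hidden_size):
        super().__init__()
        self.layer = layer
        # One learnable vector per layer; initialized to ones so the
        # wrapped model initially behaves like the pretrained one.
        self.vector = nn.Parameter(torch.ones(hidden_size))

    def forward(self, hidden_states, *args, **kwargs):
        outputs = self.layer(hidden_states, *args, **kwargs)
        # Placeholder: element-wise product, broadcast over the batch and
        # sequence dimensions. The intended per-layer operation goes here.
        modified = outputs[0] * self.vector
        return (modified,) + outputs[1:]

def add_layer_vectors(model):
    # hidden_size is read from the checkpoint's config, so the vector
    # size adapts to the model type (128 for bert-tiny, 768 for bert-base).
    hidden_size = model.config.hidden_size
    for i, layer in enumerate(model.encoder.layer):
        model.encoder.layer[i] = LayerVectorWrapper(layer, hidden_size)
    return model

model = add_layer_vectors(BertModel.from_pretrained("prajjwal1/bert-tiny"))
```

Because the wrapping happens after `from_pretrained`, all original weights are loaded unchanged and only the new vectors start as fresh parameters, which I believe would cover steps 2 and 3.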

In more detail, I’m working with prajjwal1/bert-tiny (now) and bert-base-uncased (next step).
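For completeness, this is how I currently check that the same code runs for both checkpoints (reusing the `add_layer_vectors` helper from the sketch above):

```python
import torch
from transformers import AutoTokenizer, BertModel

for name in ["prajjwal1/bert-tiny", "bert-base-uncased"]:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = add_layer_vectors(BertModel.from_pretrained(name))
    inputs = tokenizer("hello world", return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # The hidden width should follow each model's config automatically.
    print(name, out.last_hidden_state.shape)
```

Is this a reasonable way to do it, or is there a more standard approach (e.g. subclassing the model class from modeling_bert.py)?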