When I do
import os
import torch
from transformers import BertConfig, BertModel

bert_config = BertConfig.from_pretrained("bert-base-multilingual-cased")
model = BertModel(bert_config)
torch.save(model.state_dict(), "temp.p")
print("Size (MB):", os.path.getsize("temp.p") / 1e6)
and save the model, the file size is around 711 MB. But if I do the same for "bert-base-uncased" or "bert-base-cased", the size comes out around 440 MB, which is what I'd expect for ~110M parameters in fp32 (110M × 4 bytes ≈ 440 MB). Is the larger multilingual checkpoint expected, or does anyone know what I might be doing wrong?
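For what it's worth, here is a back-of-the-envelope check I tried. The two models share the same hidden size (768), but the published vocabulary sizes differ (30,522 for bert-base-uncased vs. 119,547 for bert-base-multilingual-cased), so the token-embedding matrix alone could account for most of the gap; the numbers below are just that arithmetic, not anything measured from the checkpoints:

```python
# Rough estimate of the extra fp32 storage from the larger embedding matrix.
# Vocab sizes are the published values for the two configs; hidden size is 768 for both.
hidden = 768
vocab_uncased = 30522     # bert-base-uncased
vocab_multi = 119547      # bert-base-multilingual-cased

extra_params = (vocab_multi - vocab_uncased) * hidden
extra_mb = extra_params * 4 / 1e6   # fp32 = 4 bytes per parameter
print(f"Extra embedding parameters: {extra_params:,}")
print(f"Approximate extra size: {extra_mb:.1f} MB")
```

That comes to roughly 273 MB on top of the ~440 MB base model, which is in the right neighborhood of the 711 MB I'm seeing, so maybe the size is simply expected?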