When I do:

```python
import os
import torch
from transformers import BertConfig, BertModel

bert_config = BertConfig.from_pretrained('bert-base-multilingual-cased')
model = BertModel(bert_config)
torch.save(model.state_dict(), "temp.p")
print('Size (MB):', os.path.getsize("temp.p") / 1e6)
```
the saved model is around 711 MB. But if I do the same for 'bert-base-uncased' or 'bert-base-cased', the size matches what I'd expect from 110M parameters. Is the larger multilingual size expected, or does anyone know what I might be doing wrong?
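For reference, a float32 `state_dict` on disk is roughly parameter count × 4 bytes, so the sizes can be sanity-checked with plain arithmetic (a sketch independent of any library, using the numbers from the question):

```python
def params_from_size_mb(size_mb, bytes_per_param=4):
    # Invert size = params * bytes_per_param; float32 weights take 4 bytes each
    return size_mb * 1e6 / bytes_per_param

# 110M float32 parameters -> expected checkpoint size in MB
print(110e6 * 4 / 1e6)                  # 440.0
# A 711 MB checkpoint implies roughly this many parameters (in millions)
print(params_from_size_mb(711) / 1e6)   # 177.75
```

So a 711 MB file corresponds to noticeably more than 110M parameters, which is the gap the question is about.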