Thanks for your reply. The BERT model is part of my whole model, so I directly saved the whole model. What is the best practice in this situation?
Would it be possible for you to create a Colab for this?
Also, to take advantage of .from_pretrained and .save_pretrained, you can subclass BertPreTrainedModel and add the additional layers in it. See the task-specific BERT models: they use BERT plus additional layers on top of it and subclass BertPreTrainedModel.
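A minimal sketch of what that subclassing could look like. The class name `BertWithClassifier` and the two-label linear head are illustrative assumptions, not part of the library; the pattern mirrors how models like BertForSequenceClassification are built:

```python
import torch
from transformers import BertModel, BertPreTrainedModel

class BertWithClassifier(BertPreTrainedModel):
    """Hypothetical example: BERT encoder plus one extra linear head."""

    def __init__(self, config):
        super().__init__(config)
        self.bert = BertModel(config)
        # Extra layer on top of BERT (2 output labels here, as an example)
        self.classifier = torch.nn.Linear(config.hidden_size, 2)
        # Initialize weights the same way the library's models do
        self.init_weights()

    def forward(self, input_ids, attention_mask=None):
        outputs = self.bert(input_ids, attention_mask=attention_mask)
        pooled = outputs.pooler_output  # pooled [CLS] representation
        return self.classifier(pooled)
```

With this setup, `BertWithClassifier.from_pretrained("bert-base-uncased")` loads the pretrained encoder weights and randomly initializes the new head, while `model.save_pretrained(path)` followed by `BertWithClassifier.from_pretrained(path)` round-trips the whole model, including the extra layer.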