Are Google's official BERT model and the Hugging Face BERT model the same or different?

Google’s Official BERT
HuggingFace BERT

If they are the same/equivalent, how were they created?

  • Were the pretrained parameters just copied?
  • Or were they pretrained from scratch following the exact same approach (objective/hyperparameters) and data mentioned in the paper?

I have been searching for this information for hours, but can’t find it anywhere.


They have the same parameters. As you said, the parameters are copied/converted. You’ll find that the repository contains a lot of conversion scripts to convert between TensorFlow and PyTorch. For instance this one: transformers/convert_bert_original_tf2_checkpoint_to_pytorch.py at cd56f3fe7eae4a53a9880e3f5e8f91877a78271c · huggingface/transformers · GitHub
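You can also reproduce the conversion yourself without the scripts, because `from_pretrained` supports loading TF checkpoints directly. Here is a minimal sketch, assuming you have downloaded one of Google's original checkpoint directories (the file names below are the ones Google uses for BERT-base uncased; adjust them to your download):

```python
# A minimal sketch: loading one of Google's original TF checkpoints into
# the Hugging Face PyTorch class. Paths are placeholders for a local
# download of Google's BERT-base uncased release.
from transformers import BertConfig, BertForPreTraining

config = BertConfig.from_json_file("uncased_L-12_H-768_A-12/bert_config.json")

# from_tf=True tells transformers to run its TF -> PyTorch weight
# conversion, the same logic the conversion scripts wrap in a CLI.
model = BertForPreTraining.from_pretrained(
    "uncased_L-12_H-768_A-12/bert_model.ckpt.index",
    from_tf=True,
    config=config,
)

# Save in the usual PyTorch/transformers format for later use.
model.save_pretrained("bert-base-uncased-converted")
```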

You’ll find more such scripts here: Search · convert pytorch tensorflow · GitHub
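And if you want to convince yourself that the hub checkpoint really is the same model, you can compare it weight-for-weight against a locally converted copy. A sketch, assuming the conversion above was saved to `bert-base-uncased-converted`:

```python
# Compare the hub checkpoint against a locally converted copy. If both
# originate from the same Google release, every tensor should match.
import torch
from transformers import BertForPreTraining

hub = BertForPreTraining.from_pretrained("bert-base-uncased")
local = BertForPreTraining.from_pretrained("bert-base-uncased-converted")

for (name_hub, p_hub), (name_local, p_local) in zip(
    hub.named_parameters(), local.named_parameters()
):
    assert name_hub == name_local
    assert torch.allclose(p_hub, p_local), f"mismatch in {name_hub}"

print("All parameters match.")
```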