Google’s Official BERT
If they are the same/equivalent, how are they created?
- Are the pretrained parameters just copied?
- Or are they pretrained from scratch, following the exact same approach (objective/hyperparameters) and data mentioned in the paper?
I have been searching for this information for hours, but can’t find it anywhere.
They have the same parameters. As you said, the parameters are copied/converted, not pretrained again from scratch. You’ll find that the repository contains a lot of conversion scripts for converting checkpoints between PyTorch and TensorFlow. For instance, this one: transformers/convert_bert_original_tf2_checkpoint_to_pytorch.py at cd56f3fe7eae4a53a9880e3f5e8f91877a78271c · huggingface/transformers · GitHub
You’ll find more such scripts here: Search · convert pytorch tensorflow · GitHub
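To make "copied/converted" a bit more concrete, here is a minimal sketch of what a conversion step does in principle. It is not the code from the script linked above: the helper names (`tf_name_to_pt_name`, `convert_checkpoint`) are hypothetical, the renaming rules are simplified, and plain Python lists stand in for real tensors. The one detail it relies on is that TensorFlow dense layers store their kernel as (in_features, out_features), while PyTorch's `nn.Linear` stores its weight as (out_features, in_features), so kernels are transposed while the values themselves stay unchanged.

```python
import re


def tf_name_to_pt_name(tf_name):
    """Map a TF-style variable name to a PyTorch-style one (simplified, illustrative rules)."""
    name = re.sub(r"layer_(\d+)", r"layer.\1", tf_name)  # layer_0 -> layer.0
    name = name.replace("/", ".")
    if name.endswith(".kernel"):
        name = name[: -len(".kernel")] + ".weight"
    return name


def transpose(matrix):
    """Transpose a 2-D nested list."""
    return [list(column) for column in zip(*matrix)]


def convert_checkpoint(tf_vars):
    """Copy every value into a new dict under its PyTorch-style name.

    No training happens here: values are only renamed and, for dense
    kernels, transposed from (in, out) to (out, in).
    """
    pt_state = {}
    for tf_name, value in tf_vars.items():
        if tf_name.endswith("/kernel"):
            value = transpose(value)
        pt_state[tf_name_to_pt_name(tf_name)] = value
    return pt_state


# Toy "checkpoint" with made-up values; the shapes are far smaller than real BERT.
toy_tf_vars = {
    "bert/encoder/layer_0/attention/self/query/kernel": [
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
    ],  # shape (in=2, out=3)
    "bert/encoder/layer_0/attention/self/query/bias": [0.1, 0.2, 0.3],
}

pt_state = convert_checkpoint(toy_tf_vars)
for name, value in pt_state.items():
    print(name, value)
```

Every number in the output also appears in the input, which is the sense in which the converted model has the same parameters as the original.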