Are BERT models in Transformers pretrained with Whole Word Masking?
It depends on the checkpoint you are using; both versions are available. For instance, bert-base-uncased
is the original BERT model pretrained without whole word masking (WWM), while bert-large-uncased-whole-word-masking
is pretrained with WWM. Check all the checkpoints available here
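Since WWM only changes the pretraining objective, both checkpoints are loaded the same way. A minimal sketch, where `checkpoint_name` and `load_bert` are illustrative helpers (not part of the Transformers API) that pick between the two checkpoint ids mentioned above:

```python
def checkpoint_name(whole_word_masking: bool) -> str:
    # Map the pretraining variant to its Hub checkpoint id.
    # These are the two checkpoints named in the answer above.
    return ("bert-large-uncased-whole-word-masking"
            if whole_word_masking
            else "bert-base-uncased")

def load_bert(whole_word_masking: bool = False):
    """Return (tokenizer, model) for the chosen pretraining variant.

    The import and the weight download happen only when this
    function is actually called.
    """
    from transformers import AutoModelForMaskedLM, AutoTokenizer
    name = checkpoint_name(whole_word_masking)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForMaskedLM.from_pretrained(name)
    return tokenizer, model
```

Downstream usage is identical for both variants; only the pretraining differs.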