Hello,
Are the weights of the masked LM output head of the BertForMaskedLM model pre-trained?
Or are the weights of the masked LM output head randomly initialized each time the model is called?
Thank you,