What does from_pretrained do?

Hi, I am trying to understand what from_pretrained does.
For example:

from transformers import BertModel
finbert_bertmodel=BertModel.from_pretrained('ProsusAI/finbert')

#The Bert Model transformer outputting raw hidden-states without any specific head on top.
#While Finbert has BertForSequenceClassification architecture

This is my understanding of from_pretrained for this piece of code: it loads all the corresponding weights of the FinBERT checkpoint into the BertModel architecture.
Similarly, for a different model, say:

from transformers import AutoModelForMaskedLM
finbert_maskedLM = AutoModelForMaskedLM.from_pretrained('ProsusAI/finbert')

The weights of FinBERT will be loaded into the masked-LM architecture, and the weights of the last layer (the language modeling head) will be randomly initialised, since the FinBERT architecture is for sequence classification, not masked language modeling.
Is my understanding correct?
Please add any additional details, and a link to further information if you have one.

@nielsr @sgugger ,

Can you please check this question?

Hi,

So ProsusAI/finbert · Hugging Face indeed uses the BertForSequenceClassification architecture, as can be seen from the "architectures" field in its config.json.
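For reference, the relevant part of that config.json looks roughly like this (an abbreviated excerpt; only the "architectures" field is the point here):

```json
{
  "architectures": [
    "BertForSequenceClassification"
  ],
  "model_type": "bert"
}
```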

When you do:

from transformers import BertModel
finbert_bertmodel=BertModel.from_pretrained('ProsusAI/finbert')

then all weights of the BERT encoder get initialized from the checkpoint on the hub, and the weights of the sequence classification head on top (which the finbert repo includes) are discarded, as BertModel does not include a sequence classification head.
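To make the matching step concrete, here is a minimal sketch of that logic using plain dicts instead of real tensors. The key names imitate a BERT checkpoint, but this is an illustration, not the actual transformers implementation:

```python
# Sketch of how from_pretrained matches checkpoint weights to a model's
# parameters. Plain strings stand in for tensors; the key names imitate
# a BertForSequenceClassification checkpoint (abbreviated).
checkpoint = {
    "bert.embeddings.word_embeddings.weight": "pretrained",
    "bert.encoder.layer.0.attention.self.query.weight": "pretrained",
    "classifier.weight": "pretrained",  # sequence classification head
    "classifier.bias": "pretrained",
}

# Keys a bare BertModel expects (no head on top):
model_keys = {
    "embeddings.word_embeddings.weight",
    "encoder.layer.0.attention.self.query.weight",
}

# BertModel's own keys lack the "bert." prefix that models with heads use,
# so the prefix is stripped before matching (requires Python 3.9+).
stripped = {k.removeprefix("bert."): v for k, v in checkpoint.items()}

loaded = {k: v for k, v in stripped.items() if k in model_keys}
unexpected = sorted(k for k in stripped if k not in model_keys)

print(sorted(loaded))  # encoder/embedding weights that get loaded
print(unexpected)      # classifier.* weights that get discarded
```

When this mismatch happens with the real library, transformers logs a warning listing the checkpoint weights that were not used.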

When you do:

from transformers import BertForMaskedLM
finbert_maskedlm = BertForMaskedLM.from_pretrained('ProsusAI/finbert')

then all weights of the BERT encoder get initialized from the checkpoint on the hub, the sequence classifier weights get discarded, and a randomly initialized language modeling head is added on top.
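The complementary case can be sketched the same way: keys the model expects but the checkpoint lacks are randomly initialized. Again, the names imitate real checkpoint keys, but this is an illustration, not the actual transformers code:

```python
# Sketch of loading a sequence-classification checkpoint into a
# masked-LM architecture. Plain strings stand in for tensors.
import random

checkpoint = {
    "bert.embeddings.word_embeddings.weight": "pretrained",
    "classifier.weight": "pretrained",  # seq-classification head
}

# Keys BertForMaskedLM expects: the BERT encoder plus an LM head.
model_keys = {
    "bert.embeddings.word_embeddings.weight",
    "cls.predictions.decoder.weight",  # language modeling head
}

loaded = {k: v for k, v in checkpoint.items() if k in model_keys}
unexpected = sorted(k for k in checkpoint if k not in model_keys)
missing = sorted(model_keys - checkpoint.keys())

# Missing keys (here, the LM head) get randomly initialized:
random.seed(0)
for k in missing:
    loaded[k] = f"random init ({random.random():.3f})"

print(unexpected)  # discarded: the sequence classification head
print(missing)     # randomly initialized: the LM head
```

This is also why transformers warns you to fine-tune such a model before using it for predictions: the newly initialized head has not been trained.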