I am learning the Hugging Face Transformers library and have been working through the Fine-tuning a pretrained model chapter of the Hugging Face Course.
I saw some code samples like the ones below and, after some additional Google searching, realized that things like the following could be done. I decided to explore the documentation, and searching for TFAutoModelForSequenceClassification
returned the Auto Classes page,
but that page doesn't explain methods/properties such as model.summary()
or model2.layers[0].trainable
. It doesn't even mention that TFAutoModelForSequenceClassification.from_pretrained
accepts a num_labels parameter.
I only learned these things because other Google searches happened to surface them.
Is there better documentation where I could learn about such things? I am finding it incredibly difficult to discover what the different things I can do are.
!pip install datasets transformers[sentencepiece]

from transformers import TFAutoModelForSequenceClassification

checkpoint = "bert-base-uncased"
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
model.summary()
This returns:
All model checkpoint layers were used when initializing TFBertForSequenceClassification.
Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Model: "tf_bert_for_sequence_classification_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
bert (TFBertMainLayer) multiple 109482240
dropout_75 (Dropout) multiple 0
classifier (Dense) multiple 1538
=================================================================
Total params: 109,483,778
Trainable params: 109,483,778
Non-trainable params: 0
_________________________________________________________________
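As a side note, the parameter counts in that summary can be verified by hand: the classifier head is a Dense layer mapping bert-base-uncased's 768-dimensional pooled output to num_labels=2 logits, so it holds 768 × 2 weights plus 2 biases. A quick sanity check in plain Python (no libraries needed; the 109,482,240 figure is simply copied from the summary above):

```python
hidden_size = 768   # bert-base-uncased hidden dimension
num_labels = 2      # passed to from_pretrained above

# Dense classifier head: weight matrix (768 x 2) plus bias vector (2)
classifier_params = hidden_size * num_labels + num_labels
print(classifier_params)  # 1538, matching the "classifier (Dense)" row

bert_params = 109_482_240  # TFBertMainLayer count from the summary
print(bert_params + classifier_params)  # 109483778, matching "Total params"
```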
Then
# (optional) freeze the BERT layer
# note: model2 = model binds a second name to the SAME object, not a copy,
# so the line below also freezes `model` itself
model2 = model
model2.layers[0].trainable = False
model2.summary()
which returns:
Model: "tf_bert_for_sequence_classification_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
bert (TFBertMainLayer) multiple 109482240
dropout_75 (Dropout) multiple 0
classifier (Dense) multiple 1538
=================================================================
Total params: 109,483,778
Trainable params: 1,538
Non-trainable params: 109,482,240
_________________________________________________________________
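One caution about the snippet above: `model2 = model` does not create an independent copy; both names refer to the same Keras model object, so setting `model2.layers[0].trainable = False` freezes the original `model` as well. A minimal illustration of the reference semantics, using hypothetical stand-in classes so it runs without transformers or TensorFlow:

```python
class Layer:
    """Stand-in for a Keras layer carrying a trainable flag."""
    def __init__(self):
        self.trainable = True

class Model:
    """Stand-in for a Keras model holding a list of layers."""
    def __init__(self):
        self.layers = [Layer()]

model = Model()
model2 = model          # a second name for the SAME object, not a copy
model2.layers[0].trainable = False

print(model2 is model)             # True - both names point to one object
print(model.layers[0].trainable)   # False - the "original" is frozen too
```

If you ever want two models that can be frozen independently, you need a genuinely separate model instance, not a second variable bound to the same one.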