Hi @oliverguhr Which solution worked for you for binary classification?
Thank you! This fixed my problem too!
It was weird that this didn't work:
AutoModelForSequenceClassification.from_pretrained("huggingface/CodeBERTa-language-id", num_labels=15)
but this did:
config = AutoConfig.from_pretrained("huggingface/CodeBERTa-language-id")
config.num_labels = 15
model = AutoModelForSequenceClassification.from_config(config)
Answering @tolgayan: the point is that this gets trained, you're fine-tuning the model.
I think this is the simplest and most intuitive one. Why did nobody like this?
I find the solution by @nielsr, i.e. adding the parameter ignore_mismatched_sizes,
the most elegant and simple one. It also makes clear what happens inside the code.
Hi @carlosaguayo,
Initializing a model from a config will randomly initialize all the weights of the model. To use the pre-trained weights and add a new, randomly initialized head on top, you would need to do:
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("huggingface/CodeBERTa-language-id", num_labels=15, ignore_mismatched_sizes=True)
Simple but the best solution, it solves everything.
I used the "label_names" argument on my trainer to define which labels I wanted instead of the default "labels" key.
On the trainer, I set num_labels=6 and ignore_mismatched_sizes=True appropriately; however, when running trainer.train() I get the following error:
TypeError: forward() got an unexpected keyword argument 'cohesion'
(My 6 labels are ['cohesion', 'syntax', 'vocabulary', 'phraseology', 'grammar', 'conventions'].)
How would I fix this? Thanks in advance!
EDIT: I fixed this by passing a label matrix as "labels" instead of using label_names in the training arguments, but if someone knows how to properly use that argument, I'd appreciate it.
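For reference, here is a minimal sketch of that workaround, assuming a pandas DataFrame df with a "text" column and one float column per target; the checkpoint name and problem_type="regression" are my own assumptions for illustration, not something stated in this thread:

from datasets import Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

target_cols = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(target_cols),
    problem_type="regression",  # six continuous scores -> MSE loss over a 6-dim output
    ignore_mismatched_sizes=True,
)

def to_features(batch):
    enc = tokenizer(batch["text"], truncation=True)
    # Pack the six target columns into one float vector under the "labels" key,
    # which is the only label keyword the model's forward() accepts.
    enc["labels"] = [list(row) for row in zip(*(batch[c] for c in target_cols))]
    return enc

dataset = Dataset.from_pandas(df).map(to_features, batched=True, remove_columns=list(df.columns))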
In fact, you can customize a pre-trained model by changing its layers. For instance, I use BertForSequenceClassification for a classification task.
from copy import deepcopy
import torch
from torch import nn
from transformers import BertTokenizer, BertForSequenceClassification, AutoModelForSequenceClassification
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
model.to(device)
However, if I want to change its classification head, I would do this:
cp_model = deepcopy(model)
# Replace the stock classifier (a single Linear layer) with a small MLP head.
cp_model.classifier = nn.Sequential(
    nn.Linear(768, 526),
    nn.ReLU(),
    nn.Dropout(0.1),
    nn.Linear(526, 258),
    nn.ReLU(),
    nn.Dropout(0.1),
    nn.Linear(258, 2),
    nn.Softmax(dim=-1),
)
cp_model.to(device)
In fact, you can do this directly on the model, but I make a copy because I do not want to change anything on the base model (it's just personal preference).
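As a quick sanity check (just a sketch, reusing the tokenizer from above with an arbitrary input string), you can run a forward pass and confirm the new head produces two outputs:

inputs = tokenizer("this is a test sentence", return_tensors="pt").to(device)
with torch.no_grad():
    outputs = cp_model(**inputs)
print(outputs.logits.shape)  # expected: torch.Size([1, 2])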
Hello @nielsr
I hope you are well. I am fine-tuning GPT-Neo, and to overcome overfitting I want to increase the dropout to 0.2. If I do this with the command below, can I then fine-tune the model directly on my own dataset?
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("gpt-neo")
model = AutoModelForCausalLM.from_pretrained("gpt-neo", embed_dropout=0.2, resid_dropout=0.2, attention_dropout=0.2)
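A quick way to confirm the overridden values were picked up (just a sketch; from_pretrained kwargs that match config attributes override the loaded config):

print(model.config.embed_dropout, model.config.resid_dropout, model.config.attention_dropout)
# expected: 0.2 0.2 0.2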
Since using num_labels=2 replaces the existing classifier head with a newly initialized head, right? In the code you have given for building a custom classifier head, will the existing classifier be retained? If yes, how do I remove it and ensure that it has no effect? Or is it fine if it is retained?
Yes, you can. Note that the from_pretrained method puts your model in evaluation mode by default, so you would have to call model.train()
before training (the Trainer automatically takes care of that).
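For anyone skipping the Trainer and writing their own loop, here is a minimal sketch (the optimizer settings and train_dataloader are placeholders, not from this thread):

import torch

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()  # re-enable dropout before training (from_pretrained left it in eval mode)
for batch in train_dataloader:  # placeholder: yields dicts of tensors including "labels"
    batch = {k: v.to(model.device) for k, v in batch.items()}
    outputs = model(**batch)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
model.eval()  # switch back to evaluation mode for inference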
And how can you load it? I got an error when loading it; it didn't load the edited model.
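Assuming the edit is the replaced classifier from the earlier post, one pattern that usually works (a sketch, not the only way) is to save the state dict yourself, rebuild the same edited architecture, and load the weights into it; from_pretrained alone re-creates the stock head and cannot restore the custom one:

import torch
from torch import nn
from transformers import AutoModelForSequenceClassification

# Save the edited model's weights.
torch.save(cp_model.state_dict(), "edited_model.pt")

# Later: rebuild the exact same architecture (base model + replaced head) ...
reloaded = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
reloaded.classifier = nn.Sequential(
    nn.Linear(768, 526),
    nn.ReLU(),
    nn.Dropout(0.1),
    nn.Linear(526, 258),
    nn.ReLU(),
    nn.Dropout(0.1),
    nn.Linear(258, 2),
    nn.Softmax(dim=-1),
)
# ... then load the saved weights into it.
reloaded.load_state_dict(torch.load("edited_model.pt"))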