I fine-tuned DistilBERT for a regression task (using num_labels=1) and it seemed to work. But after saving the model to disk with model.save_pretrained(f"checkpoints/model_epoch_{epoch}") and loading it again, inference on a sample piece of text outputs a 768-dimensional vector per token instead of a single number:
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
odict_values([tensor([[[-0.0013, 0.0024, 0.0388, ..., 0.0087, 0.0316, 0.0316],
[ 0.0128, 0.0046, 0.0446, ..., 0.0043, 0.0132, 0.0331],
[ 0.0124, 0.0069, 0.0430, ..., 0.0060, 0.0124, 0.0369],
...,
[ 0.0167, 0.0159, 0.0357, ..., 0.0059, 0.0145, 0.0299],
[ 0.0139, 0.0140, 0.0340, ..., 0.0076, 0.0157, 0.0298],
[ 0.0144, 0.0284, 0.0265, ..., 0.0117, 0.0108, 0.0268]]],
grad_fn=<NativeLayerNormBackward>)])
Not sure what I’m doing wrong here.
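For comparison, here is a small sketch of the two output shapes I'd expect: the sequence-classification head with num_labels=1 should give one scalar per example, while the bare encoder gives a per-token hidden-state matrix like the one above. (This uses a tiny randomly initialised config so it runs without downloading weights; the real checkpoint uses dim=768.)

```python
import torch
from transformers import (
    DistilBertConfig,
    DistilBertForSequenceClassification,
    DistilBertModel,
)

# Tiny config so the sketch runs offline; num_labels=1 means a regression head.
config = DistilBertConfig(n_layers=2, n_heads=2, dim=64, hidden_dim=128, num_labels=1)

input_ids = torch.tensor([[101, 2054, 2003, 102]])  # dummy token ids, seq_len=4

# With the classification/regression head: one logit per example.
reg_model = DistilBertForSequenceClassification(config)
with torch.no_grad():
    out = reg_model(input_ids=input_ids)
print(out.logits.shape)  # torch.Size([1, 1])

# The bare encoder, by contrast, returns per-token hidden states.
base_model = DistilBertModel(config)
with torch.no_grad():
    hidden = base_model(input_ids=input_ids).last_hidden_state
print(hidden.shape)  # torch.Size([1, 4, 64]) -- (batch, seq_len, dim)
```

Getting a (batch, seq_len, 768) tensor like in my output suggests the loaded model is behaving like the bare encoder rather than the regression model.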