I fine-tuned DistilBERT for a regression task (using num_labels=1) and it seemed to work. But after saving the model to disk with model.save_pretrained(f"checkpoints/model_epoch_{epoch}") and loading it again, inference on a sample piece of text outputs a 768-dimensional vector per token instead of a single number:
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
odict_values([tensor([[[-0.0013, 0.0024, 0.0388, ..., 0.0087, 0.0316, 0.0316],
[ 0.0128, 0.0046, 0.0446, ..., 0.0043, 0.0132, 0.0331],
[ 0.0124, 0.0069, 0.0430, ..., 0.0060, 0.0124, 0.0369],
...,
[ 0.0167, 0.0159, 0.0357, ..., 0.0059, 0.0145, 0.0299],
[ 0.0139, 0.0140, 0.0340, ..., 0.0076, 0.0157, 0.0298],
[ 0.0144, 0.0284, 0.0265, ..., 0.0117, 0.0108, 0.0268]]],
grad_fn=<NativeLayerNormBackward>)])
Not sure what I’m doing wrong here.
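For comparison, here is a small sketch of the two output shapes I'd expect: the sequence-classification head with num_labels=1 should give one scalar per example, while the bare encoder gives a per-token hidden-state matrix like the one above. (This uses a tiny randomly initialised config so it runs without downloading weights; the real checkpoint uses dim=768.)

```python
import torch
from transformers import (
    DistilBertConfig,
    DistilBertForSequenceClassification,
    DistilBertModel,
)

# Tiny config so the sketch runs offline; num_labels=1 means a regression head.
config = DistilBertConfig(n_layers=2, n_heads=2, dim=64, hidden_dim=128, num_labels=1)

input_ids = torch.tensor([[101, 2054, 2003, 102]])  # dummy token ids, seq_len=4

# With the classification/regression head: one logit per example.
reg_model = DistilBertForSequenceClassification(config)
with torch.no_grad():
    out = reg_model(input_ids=input_ids)
print(out.logits.shape)  # torch.Size([1, 1])

# The bare encoder, by contrast, returns per-token hidden states.
base_model = DistilBertModel(config)
with torch.no_grad():
    hidden = base_model(input_ids=input_ids).last_hidden_state
print(hidden.shape)  # torch.Size([1, 4, 64]) -- (batch, seq_len, dim)
```

Getting a (batch, seq_len, 768) tensor like in my output suggests the loaded model is behaving like the bare encoder rather than the regression model.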