I'm trying to fine-tune a DistilBert model on a new dataset, and when I attempt to train I get this error:
ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state. For reference, the inputs it received are input_ids,attention_mask.
Why? I've followed the Hugging Face tutorials all the way through.
As far as I can tell, my dataset is in the right shape:
DatasetDict({
    train: Dataset({
        features: ['text', 'labels'],
        num_rows: 44330
    })
    test: Dataset({
        features: ['text', 'labels'],
        num_rows: 11083
    })
})
After tokenizing it with DistilBertTokenizer.from_pretrained('distilbert-base-cased') (applied via Dataset.map), it looks like this:
DatasetDict({
    train: Dataset({
        features: ['text', 'labels', 'input_ids', 'attention_mask'],
        num_rows: 44330
    })
    test: Dataset({
        features: ['text', 'labels', 'input_ids', 'attention_mask'],
        num_rows: 11083
    })
})
I can't find any solutions, or even guidance on how to debug this, anywhere. A little help would be awesome.