Hi, I am trying to fine-tune a model from the Hugging Face Transformers library, following this YouTube video: Simple Training with the 🤗 Transformers Trainer - YouTube
I have followed all the steps, but for some reason I get an error when I execute the trainer.train() command. This is the error I am getting:
TypeError: Expected sequence or array-like, got <class 'transformers.trainer_utils.EvalPrediction'>
The full traceback is as follows:
The following columns in the evaluation set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: label_text, text. If label_text, text are not expected by `BertForSequenceClassification.forward`, you can safely ignore this message.
***** Running Evaluation *****
Num examples = 2000
Batch size = 64
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-147-3435b262f1ae> in <module>
----> 1 trainer.train()
/usr/local/lib/python3.8/dist-packages/sklearn/utils/validation.py in _num_samples(x)
263 x = np.asarray(x)
264 else:
--> 265 raise TypeError(message)
266
267 if hasattr(x, "shape") and x.shape is not None:
TypeError: Expected sequence or array-like, got <class 'transformers.trainer_utils.EvalPrediction'>
Does anyone know what could be the reason behind this error? If needed, you can access my Google Colab notebook here; it contains all the code I wrote (the same as the code shown in the video): Google Colab
In case it helps, I am also sharing my compute_metrics function and training arguments below:
def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    f1 = f1_score(labels, pred, average='weighted')
    return {"f1": f1}
batch_size = 64
logging_steps = len(emotion_dataset['train'])
output_dir = "minlm-finetuned-emotion"
training_args = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=5,
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    logging_steps=logging_steps,
    fp16=True,
    push_to_hub=True,
)
trainer = WeightedLossTrainer(
    model=model,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=emotion_dataset['train'],
    eval_dataset=emotion_dataset['validation'],
    tokenizer=tokenizer,
)
trainer.train()
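For what it's worth, I can reproduce the same TypeError outside the Trainer with a minimal sketch. The FakeEvalPrediction class and the dummy arrays below are my own stand-ins (not the real transformers.trainer_utils.EvalPrediction), but they show that sklearn's f1_score raises this exact message when it is handed the whole prediction object rather than an array of predicted labels:

```python
import numpy as np
from sklearn.metrics import f1_score

# My own mock of transformers.trainer_utils.EvalPrediction:
# just an object holding predictions and label_ids, like the real one
class FakeEvalPrediction:
    def __init__(self, predictions, label_ids):
        self.predictions = predictions
        self.label_ids = label_ids

# Dummy logits for 2 examples / 2 classes, plus dummy labels
pred = FakeEvalPrediction(predictions=np.array([[0.1, 0.9], [0.8, 0.2]]),
                          label_ids=np.array([1, 0]))

labels = pred.label_ids
preds = pred.predictions.argmax(-1)

# Passing arrays works fine:
print(f1_score(labels, preds, average='weighted'))

# Passing the whole prediction object raises the same error I see:
try:
    f1_score(labels, pred, average='weighted')
except TypeError as e:
    print(e)  # Expected sequence or array-like, got <class '__main__.FakeEvalPrediction'>
```

So I suspect something about how compute_metrics passes its arguments into f1_score, but I am not sure what the correct fix is.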