ValueError: Trainer: evaluation requires an eval_dataset

I’m trying to do a finetuning without an evaluation dataset.
For that, I’m using the following code:

    from sklearn.metrics import accuracy_score, f1_score
    from transformers import EvalPrediction, Trainer, TrainingArguments

    training_args = TrainingArguments(
        output_dir=resume_from_checkpoint,
        evaluation_strategy="epoch",
        per_device_train_batch_size=1,
    )

    def compute_metrics(pred: EvalPrediction):
        labels = pred.label_ids
        preds = pred.predictions.argmax(-1)
        f1 = f1_score(labels, preds, average="weighted")
        acc = accuracy_score(labels, preds)  # accuracy_score takes no "average" argument
        return {"accuracy": acc, "f1": f1}

    trainer = Trainer(
        model=self.nli_model,
        args=training_args,
        train_dataset=tokenized_datasets,
        compute_metrics=compute_metrics,
    )

However, I get `ValueError: Trainer: evaluation requires an eval_dataset.` I thought that, by default, Trainer does no evaluation; at least that's the idea I got from the docs.

Hello.

In the source code here, the get_eval_dataloader function contains this check:

    if eval_dataset is None and self.eval_dataset is None:
        raise ValueError("Trainer: evaluation requires an eval_dataset.")

This check enforces that an evaluation dataset is available whenever the trainer runs an evaluation step; if none was provided, it raises this error to alert the user.

Because you set `evaluation_strategy="epoch"`, the trainer tries to evaluate at the end of every epoch, and to compute your metrics it needs an `eval_dataset` to evaluate on.

You can avoid this error by not setting `evaluation_strategy` in `TrainingArguments` (the default is `"no"`, which disables evaluation), or by passing an `eval_dataset` to `Trainer`.
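For example, a minimal sketch of the train-only setup, reusing the names from your snippet (`resume_from_checkpoint`, `self.nli_model`, `tokenized_datasets` are assumed to be defined as in your code):

```python
from transformers import Trainer, TrainingArguments

# Omit evaluation_strategy entirely: it defaults to "no",
# so the Trainer never calls get_eval_dataloader().
training_args = TrainingArguments(
    output_dir=resume_from_checkpoint,  # from your snippet
    per_device_train_batch_size=1,
)

trainer = Trainer(
    model=self.nli_model,               # from your snippet
    args=training_args,
    train_dataset=tokenized_datasets,
    # no compute_metrics needed when there is no evaluation
)
trainer.train()

# Alternatively, keep evaluation_strategy="epoch" and supply an
# evaluation split, e.g. eval_dataset=tokenized_datasets["validation"].
```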
