I have a single script that performs fine-tuning (if required) and testing. For the test I use `predict`, setting `predict_with_generate=True`.
I use a Trainer instance for doing so; more specifically, the test is done with `trainer.predict()`.
The problem arises when I want to generate text with zero- or k-shot prompting and no fine-tuning. In that case the training set is empty, and I get an error from the Trainer even if I specify `do_train=False` in its training arguments.
Using `Trainer.predict()` is very convenient because it takes care of all the start/end tokens the specific model expects. If I had to modify my code for generating text, I would need to handle too many little details myself.
Is there a way to tell the Trainer to simply ignore the training set if the `.train()` method is never called?
When I asked Hugging Chat, I got the following response. I wonder if it really works…?
To resolve the issue where the Hugging Face Trainer requires a training or evaluation dataset when using `trainer.predict()`, follow these steps:
- **Set Training Flags:** Ensure `do_train=False` and `do_eval=False` in your `TrainingArguments` to bypass training and evaluation.
- **Pass Empty Datasets:** Provide `None` for both `train_dataset` and `eval_dataset` when initializing the Trainer. This prevents the Trainer from attempting to use non-existent datasets.
- **Adjust Evaluation Strategy:** Optionally, set `eval_strategy='no'` to disable any evaluation during prediction.
By implementing these steps, you can use `trainer.predict()` without encountering errors related to missing datasets.
Answer: You can configure the Trainer to ignore the training and evaluation datasets by setting `do_train=False`, `do_eval=False`, and passing `None` for both datasets. This allows you to use `trainer.predict()` without any issues [1][2].
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    do_train=False,
    do_eval=False,
    do_predict=True,
    eval_strategy="no",  # optional: disable evaluation during prediction
)

trainer = Trainer(
    model=model,  # a previously loaded model
    args=training_args,
    train_dataset=None,  # no training data needed when .train() is never called
    eval_dataset=None,
)

predictions = trainer.predict(test_dataset)
This setup ensures the Trainer doesn’t require any datasets for training or evaluation, allowing your prediction task to proceed smoothly [1][2].