Error when fine-tuning imdb with the script

Hi,

I’m using the run_glue.py script I found here to fine-tune a model on IMDB.
I followed the example. Training looks fine and eval looks fine, but no results are reported at the end.

python run_glue.py \
  --model_name_or_path bert-base-cased \
  --dataset_name imdb  \
  --do_train \
  --do_predict \
  --max_seq_length 128 \
  --per_device_train_batch_size 4 \
  --learning_rate 2e-5 \
  --num_train_epochs 1 \
  --output_dir tmp/imdb/

I also tried adding --remove_unused_columns False:

python run_glue.py \
  --model_name_or_path bert-base-cased \
  --dataset_name imdb  \
  --do_train \
  --do_predict \
  --max_seq_length 128 \
  --per_device_train_batch_size 4 \
  --learning_rate 2e-5 \
  --num_train_epochs 1 \
  --remove_unused_columns False \
  --output_dir tmp/imdb/

I get this output at the end of each attempt:

10/29/2021 08:07:12 - INFO - __main__ - ***** Predict results None *****
[INFO|modelcard.py:449] 2021-10-29 08:07:12,801 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Text Classification', 'type': 'text-classification'}, 'dataset': {'name': 'imdb', 'type': 'imdb', 'args': 'plain_text'}}

I don’t really know what is happening because there is no crash.
Transformers, datasets and tokenizers are up to date.

Thank you.

You need to adapt the script a bit for it to work on the IMDB dataset. It has no “validation” split (the evaluation split is named “test”), so you have to change the name of the split the script reads.
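
As a rough sketch of that kind of change (not the script's actual code): assuming the script fetches its evaluation data with a lookup like raw_datasets["validation"], a small helper could fall back to “test” when no “validation” split exists. The name pick_eval_split is hypothetical and only for illustration.

```python
def pick_eval_split(split_names, preferred=("validation", "test")):
    """Return the first preferred split name that the dataset actually has.

    Hypothetical helper, not part of run_glue.py. `split_names` is any
    iterable of split names, e.g. the keys of a datasets.DatasetDict.
    """
    available = set(split_names)
    for name in preferred:
        if name in available:
            return name
    raise ValueError(
        f"None of {preferred} found among splits: {sorted(available)}"
    )


# IMDB ships "train", "test" and "unsupervised", with no "validation" split.
imdb_splits = ["train", "test", "unsupervised"]
eval_split = pick_eval_split(imdb_splits)
print(eval_split)
```

Inside the script, the evaluation lookup could then become something like raw_datasets[pick_eval_split(raw_datasets.keys())], assuming raw_datasets is the object returned by load_dataset. Hardcoding "test" in that lookup would work just as well if you only ever run on IMDB.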