Trainer gives error after 1st epoch when using F1 score

Hi,

I am trying to fine-tune a BERT model on a custom dataset with 3 classes. I've followed the Fine-tune a pretrained model tutorial and slightly adapted it to my needs.

However, when I run it, I get the following error after the first epoch:

{'eval_loss': 0.847770631313324, 'eval_f1': array([0.  , 0.88, 0.  ]), 'eval_runtime': 0.02, 'eval_samples_per_second': 699.017, 'eval_steps_per_second': 99.86, 'epoch': 1.0}
TypeError: Object of type ndarray is not JSON serializable
Full stack trace:
{'eval_loss': 0.847770631313324, 'eval_f1': array([0.  , 0.88, 0.  ]), 'eval_runtime': 0.02, 'eval_samples_per_second': 699.017, 'eval_steps_per_second': 99.86, 'epoch': 1.0}
Model weights saved in ../../base_dir/finetuned_gnd_local/checkpoint-31/pytorch_model.bin
tokenizer config file saved in ../../base_dir/finetuned_gnd_local/checkpoint-31/tokenizer_config.json
Special tokens file saved in ../../base_dir/finetuned_gnd_local/checkpoint-31/special_tokens_map.json
Traceback (most recent call last):
  File ".../finetune_bert.py", line 222, in <module>
    finetune_bert(data_path=args.data_path,
  File ".../finetune_bert.py", line 216, in finetune_bert
    trainer.train()
  File ".../venv/lib/python3.10/site-packages/transformers/trainer.py", line 1498, in train
    return inner_training_loop(
  File ".../venv/lib/python3.10/site-packages/transformers/trainer.py", line 1832, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
  File ".../venv/lib/python3.10/site-packages/transformers/trainer.py", line 2042, in _maybe_log_save_evaluate
    self._save_checkpoint(model, trial, metrics=metrics)
  File ".../venv/lib/python3.10/site-packages/transformers/trainer.py", line 2173, in _save_checkpoint
    self.state.save_to_json(os.path.join(output_dir, TRAINER_STATE_NAME))
  File ".../venv/lib/python3.10/site-packages/transformers/trainer_callback.py", line 97, in save_to_json
    json_string = json.dumps(dataclasses.asdict(self), indent=2, sort_keys=True) + "\n"
  File "/usr/lib/python3.10/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python3.10/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/usr/lib/python3.10/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.10/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type ndarray is not JSON serializable

The problem only occurs when I use the F1 score as a metric with average=None, which returns one score per class instead of a single number.
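
For context, my compute_metrics() looks roughly like this (simplified sketch; the actual metric loading and label setup differ a bit):

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # average=None returns one F1 score per class as a NumPy ndarray
    # instead of a single float
    return {"f1": f1_score(labels, predictions, average=None)}
```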

So I believe this comes down to a bit of a misunderstanding on my part. I'm assuming the compute_metrics() function is used by the Trainer to track how the model is performing, and therefore it doesn't like getting an array of per-class F1 scores?
If that is the case, could someone explain, or point me towards a guide on, how I should properly log the training results that interest me? In my case I would like to print the per-class F1 scores so I can understand which classes my model is struggling with, though I may also be interested in other metrics in the future.
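
The only workaround I can think of is unpacking the array into one plain float per class, roughly like below, but I'm not sure this is the intended approach:

```python
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    per_class_f1 = f1_score(labels, predictions, average=None)
    # one scalar entry per class keeps the metrics dict JSON serializable
    return {f"f1_class_{i}": float(score) for i, score in enumerate(per_class_f1)}
```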

Could anyone shine a light on this please?
