Hey! I am making an Image AutoEncoder of type PreTrainedModel so that it's compatible with the Trainer class. I understand that the output should be in a specific format so the Trainer can automatically interpret it, but what I don't understand is which format I should adopt. For example, I am feeding in images of shape batch_size x 3 x 256 x 256 and my output is another tensor with the same dimensions, so I return the loss and the logits
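For reference, here is a minimal sketch of what the forward pass looks like (the class and layer names are simplified placeholders, not my exact code):

```python
import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel

class AutoEncoderConfig(PretrainedConfig):
    model_type = "image-autoencoder"  # placeholder name

class ImageAutoEncoder(PreTrainedModel):
    config_class = AutoEncoderConfig

    def __init__(self, config):
        super().__init__(config)
        # Toy encoder/decoder pair; the real model is deeper
        self.encoder = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)          # 256 -> 128
        self.decoder = nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1)  # 128 -> 256

    def forward(self, pixel_values, labels=None):
        reconstruction = self.decoder(self.encoder(pixel_values))    # batch x 3 x 256 x 256
        loss = nn.functional.mse_loss(reconstruction, pixel_values)  # the input is its own target
        # Dict output: the Trainer reads "loss" for backprop and treats
        # the remaining entries as the logits during evaluation
        return {"loss": loss, "logits": reconstruction}
```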
While that works during the training phase, the Trainer fails during the evaluation phase and gives me the following error:
9%|▉ | 29/320 [00:27<01:59, 2.44it/s]Could not estimate the number of tokens of the input, floating-point operations will not be computed
9%|▉ | 30/320 [00:27<01:57, 2.46it/s]Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|▉ | 31/320 [00:28<01:57, 2.46it/s]Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|█ | 32/320 [00:28<01:32, 3.13it/s]***** Running Evaluation *****
Num examples = 200
Batch size = 32
0%| | 0/7 [00:00<?, ?it/s]Traceback (most recent call last):
File "C:\Users\nisha\Documents\Imagine\main.py", line 60, in <module>
main()
File "C:\Users\nisha\Documents\Imagine\main.py", line 52, in main
atrain(args.dataset, args.subdatasets)
File "C:\Users\nisha\Documents\Imagine\autoencoder_trainer.py", line 91, in train
train_single_asset(subdataset, dataset_tag)
File "C:\Users\nisha\Documents\Imagine\autoencoder_trainer.py", line 75, in train_single_asset
trainer.train()
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer.py", line 1455, in train
self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer.py", line 1565, in _maybe_log_save_evaluate
metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer.py", line 2208, in evaluate
output = eval_loop(
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer.py", line 2394, in evaluation_loop
preds_host = logits if preds_host is None else nested_concat(preds_host, logits, padding_index=-100)
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer_pt_utils.py", line 106, in nested_concat
return type(tensors)(nested_concat(t, n, padding_index=padding_index) for t, n in zip(tensors, new_tensors))
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer_pt_utils.py", line 106, in <genexpr>
return type(tensors)(nested_concat(t, n, padding_index=padding_index) for t, n in zip(tensors, new_tensors))
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer_pt_utils.py", line 108, in nested_concat
return torch_pad_and_concatenate(tensors, new_tensors, padding_index=padding_index)
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer_pt_utils.py", line 69, in torch_pad_and_concatenate
if len(tensor1.shape) == 1 or tensor1.shape[1] == tensor2.shape[1]:
Upon inspection, it seems that the output doesn't match what the Trainer expects. I should probably mention that since I have an AutoEncoder model, I don't have an explicit label: the input is the label.
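In case it matters, this is roughly how the dataset feeds the Trainer (again a simplified placeholder; the key name pixel_values just has to match what my forward expects):

```python
from torch.utils.data import Dataset

class ReconstructionDataset(Dataset):
    # Placeholder: the real dataset loads and transforms images from disk
    def __init__(self, images):
        self.images = images  # tensor of shape num_images x 3 x 256 x 256

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # No separate "labels" entry: the image is its own reconstruction target
        return {"pixel_values": self.images[idx]}
```

Any help would be much appreciated!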