Hey! I am making an Image AutoEncoder of type PreTrainedModel so that it's compatible with the Trainer class. I understand that the output should be in a specific format so the Trainer can automatically interpret it, but what I don't understand is which format I should adopt. For example, I am feeding in images of shape batch_size x 3 x 256 x 256 and my output is another tensor with the same dimensions, so I return the loss and the logits
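For reference, here is a minimal sketch of what the forward pass looks like (the class and layer names are simplified placeholders, not my exact code):

```python
import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel

class AutoEncoderConfig(PretrainedConfig):
    model_type = "image-autoencoder"  # placeholder name

class ImageAutoEncoder(PreTrainedModel):
    config_class = AutoEncoderConfig

    def __init__(self, config):
        super().__init__(config)
        # Toy encoder/decoder pair; the real model is deeper
        self.encoder = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)          # 256 -> 128
        self.decoder = nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1)  # 128 -> 256

    def forward(self, pixel_values, labels=None):
        reconstruction = self.decoder(self.encoder(pixel_values))    # batch x 3 x 256 x 256
        loss = nn.functional.mse_loss(reconstruction, pixel_values)  # the input is its own target
        # Dict output: the Trainer reads "loss" for backprop and treats
        # the remaining entries as the logits during evaluation
        return {"loss": loss, "logits": reconstruction}
```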
While that works during the training phase, the Trainer fails during the evaluation phase and gives me the following error:
9%|▉ | 29/320 [00:27<01:59, 2.44it/s]Could not estimate the number of tokens of the input, floating-point operations will not be computed
9%|▉ | 30/320 [00:27<01:57, 2.46it/s]Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|▉ | 31/320 [00:28<01:57, 2.46it/s]Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|█ | 32/320 [00:28<01:32, 3.13it/s]***** Running Evaluation *****
Num examples = 200
Batch size = 32
0%| | 0/7 [00:00<?, ?it/s]Traceback (most recent call last):
File "C:\Users\nisha\Documents\Imagine\main.py", line 60, in <module>
main()
File "C:\Users\nisha\Documents\Imagine\main.py", line 52, in main
atrain(args.dataset, args.subdatasets)
File "C:\Users\nisha\Documents\Imagine\autoencoder_trainer.py", line 91, in train
train_single_asset(subdataset, dataset_tag)
File "C:\Users\nisha\Documents\Imagine\autoencoder_trainer.py", line 75, in train_single_asset
trainer.train()
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer.py", line 1455, in train
self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer.py", line 1565, in _maybe_log_save_evaluate
metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer.py", line 2208, in evaluate
output = eval_loop(
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer.py", line 2394, in evaluation_loop
preds_host = logits if preds_host is None else nested_concat(preds_host, logits, padding_index=-100)
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer_pt_utils.py", line 106, in nested_concat
return type(tensors)(nested_concat(t, n, padding_index=padding_index) for t, n in zip(tensors, new_tensors))
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer_pt_utils.py", line 106, in <genexpr>
return type(tensors)(nested_concat(t, n, padding_index=padding_index) for t, n in zip(tensors, new_tensors))
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer_pt_utils.py", line 108, in nested_concat
return torch_pad_and_concatenate(tensors, new_tensors, padding_index=padding_index)
File "C:\Users\nisha\.conda\envs\imagine\lib\site-packages\transformers\trainer_pt_utils.py", line 69, in torch_pad_and_concatenate
if len(tensor1.shape) == 1 or tensor1.shape[1] == tensor2.shape[1]:
Upon inspection, it seems that the output doesn't match what the Trainer expects. I should probably mention that since I have an AutoEncoder model, I don't have an explicit label: the input is the label.
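In case it matters, this is roughly how the dataset feeds the Trainer (again a simplified placeholder; the key name pixel_values just has to match what my forward expects):

```python
from torch.utils.data import Dataset

class ReconstructionDataset(Dataset):
    # Placeholder: the real dataset loads and transforms images from disk
    def __init__(self, images):
        self.images = images  # tensor of shape num_images x 3 x 256 x 256

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # No separate "labels" entry: the image is its own reconstruction target
        return {"pixel_values": self.images[idx]}
```

Any help would be much appreciated!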