Hi
I am trying to create a custom model using the transformers
library, specifically the ViT model.
As a benchmark, my goal is to replicate the ViTForImageClassification
class by adding only a linear layer on top of the ViTModel
output.
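For readers following along, here is a minimal sketch of the kind of model described above. It is not the actual ViTCustom code from this post, which is not shown. It assumes the linear layer is applied to the [CLS] token of the ViTModel output, as ViTForImageClassification does, and that the forward pass returns a SequenceClassifierOutput. The argument names num_labels and labels are illustrative choices.

```python
import torch
from torch import nn
from transformers import ViTConfig, ViTModel
from transformers.modeling_outputs import SequenceClassifierOutput


class ViTCustom(nn.Module):
    """Illustrative re-creation: ViTModel backbone plus one linear classifier."""

    def __init__(self, config: ViTConfig, num_labels: int):
        super().__init__()
        self.num_labels = num_labels
        # No pooling layer: the classifier reads the raw [CLS] hidden state.
        self.vit = ViTModel(config, add_pooling_layer=False)
        self.classifier = nn.Linear(config.hidden_size, num_labels)

    def forward(self, pixel_values, labels=None):
        outputs = self.vit(pixel_values=pixel_values)
        # The first token of the sequence is the [CLS] token.
        cls_state = outputs.last_hidden_state[:, 0, :]
        logits = self.classifier(cls_state)

        loss = None
        if labels is not None:
            loss = nn.functional.cross_entropy(logits, labels)

        # A ModelOutput omits fields that are None when it is converted to a
        # tuple or iterated as a dict, so no None entry is passed downstream.
        return SequenceClassifierOutput(loss=loss, logits=logits)
```

One design note, offered as a possibility rather than a diagnosis: a forward pass that returns a plain tuple or dict containing None entries is one way to reach the 'NoneType' error during evaluation, because the Trainer detaches every returned element after the loss. Whether that applies to the ViTCustom model in this post cannot be confirmed without its code.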
But I run into problems during the evaluation phase when I use my ViTCustom
model. If I set the argument evaluation_strategy="no"
inside the function TrainingArguments(),
the model updates its parameters, and the training loss decreases and reaches levels similar to those of ViTForImageClassification
(good, sanity check). But when I set evaluation_strategy="steps",
training fails at the first evaluation with the following error:
/usr/local/lib/python3.7/dist-packages/transformers/optimization.py:310: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
FutureWarning,
***** Running training *****
Num examples = 1451
Num Epochs = 4
Instantaneous batch size per device = 16
Total train batch size (w. parallel, distributed & accumulation) = 16
Gradient Accumulation steps = 1
Total optimization steps = 364
[101/364 00:30 < 01:20, 3.28 it/s, Epoch 1.10/4]
Step Training Loss Validation Loss
***** Running Evaluation *****
Num examples = 170
Batch size = 8
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-59-5bdd51e9c139> in <module>
----> 1 train_results = trainer.train()
2 trainer.save_model()
3 trainer.log_metrics("train", train_results.metrics)
4 trainer.save_metrics("train", train_results.metrics)
5 trainer.save_state()
8 frames
/usr/local/lib/python3.7/dist-packages/transformers/trainer_pt_utils.py in nested_detach(tensors)
158 if isinstance(tensors, (list, tuple)):
159 return type(tensors)(nested_detach(t) for t in tensors)
--> 160 return tensors.detach()
161
162
AttributeError: 'NoneType' object has no attribute 'detach'
Has anyone who has faced a similar issue any idea why the ViTCustom
model returns a 'NoneType' in evaluation mode, which raises an error when the Trainer tries to detach the tensors?
best,
Cristóbal