Hi there,
I’m currently working with the Hugging Face transformers framework to train a binary classifier. I saved the newly trained model and want to use the checkpoint for incremental learning (basically, I want to retrain my model on new data). I use
model = AutoModelForSequenceClassification.from_pretrained('path_to_model/checkpoint-500', num_labels=2)
to load the model and
trainer.train(resume_from_checkpoint=True)
to train it. But regardless of the size of the new labeled data, training finishes almost instantly. I don’t use a GPU, only a normal CPU, so hardware cannot explain the speed. It should take a few minutes (I use a small dataset for the trial run), but it seems to finish within seconds. I read through many GitHub issues, but people there observe bad results or slow performance when reloading their models, not this kind of speed-up (Saving and reloading DistilBertForTokenClassification fine-tuned model · Issue #8272 · huggingface/transformers · GitHub).
(I tried it both in a Jupyter notebook and in a Python script and observe the same issue in both.)
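My understanding of what `resume_from_checkpoint=True` does (this is a toy sketch of the skipping logic as I understand it, not the actual Trainer implementation, and the numbers below are made up): the Trainer fast-forwards through batches until it reaches the global step saved in the checkpoint, so if the saved global step is already past the total number of optimization steps of the new run, no batch is actually trained.

```python
import math

# Toy sketch of (my understanding of) the Trainer's resume behavior --
# NOT the real implementation, just the step-skipping logic in isolation.
def resume_run(num_examples, batch_size, num_epochs, saved_global_step):
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * num_epochs
    trained_steps = 0
    global_step = 0
    for _epoch in range(num_epochs):
        for _step in range(steps_per_epoch):
            global_step += 1
            if global_step <= saved_global_step:
                continue  # already done in a previous run: batch is skipped
            trained_steps += 1  # this batch would actually be trained
    return total_steps, trained_steps

# Hypothetical numbers: 100 examples, batch size 10, 2 epochs.
print(resume_run(100, 10, 2, saved_global_step=0))    # fresh run: (20, 20)
print(resume_run(100, 10, 2, saved_global_step=500))  # resuming past the end: (20, 0)
```

If that understanding is right, resuming from a checkpoint whose step count exceeds the new run’s total steps would look exactly like what I see: the run “completes” without training anything.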
Here’s the output:
Loading model from *model_path*.
The following columns in the training set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: __index_level_0__, window_text, document_id. If __index_level_0__, window_text, document_id are not expected by `DistilBertForSequenceClassification.forward`, you can safely ignore this message.
*path*: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
***** Running training *****
Num examples = 167
Num Epochs = 2
Instantaneous batch size per device = 16
Total train batch size (w. parallel, distributed & accumulation) = 16
Gradient Accumulation steps = 1
Total optimization steps = 22
Continuing training from checkpoint, will skip to saved global_step
Continuing training from epoch 45
Continuing training from global step 500
0%| | 0/22 [00:00<?, ?it/s]
Training completed. Do not forget to share your model on huggingface.co/models =)
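Checking the numbers from the log myself (167 examples, batch size 16, 2 epochs, checkpoint saved at global step 500), the arithmetic seems consistent with the log, and the saved step is already past the entire new run:

```python
import math

steps_per_epoch = math.ceil(167 / 16)    # 11 optimization steps per epoch
total_steps = steps_per_epoch * 2        # matches "Total optimization steps = 22"
epochs_trained = 500 // steps_per_epoch  # matches "Continuing training from epoch 45"

print(steps_per_epoch, total_steps, epochs_trained)  # 11 22 45
print(500 >= total_steps)  # True: global step 500 is already beyond all 22 steps
```

So it looks like all 22 steps are treated as already completed, which would explain the instant finish, but I don’t know what the intended way to retrain on new data from a checkpoint is.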
I am grateful for any ideas or recommendations. Thank you!