and used my dataset and the results are very good. I am now having a hard time trying to understand how to save the model I trained and all the artifacts needed to use my model later.
At the end of the tutorial I tried torch.save(trainer, 'my_model'), but I got this error message:
AttributeError: Can't pickle local object 'get_linear_schedule_with_warmup.<locals>.lr_lambda'
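For context on this error, here is a minimal sketch using only the standard library. torch.save relies on pickle, and pickle stores a function by the name it can be imported under, so a function defined inside another function (a "local object") cannot be serialized. The names make_schedule, top_level_schedule and can_pickle are illustrative stand-ins, not the library's actual code.

```python
import pickle


def make_schedule(num_warmup_steps):
    # Hypothetical stand-in for a scheduler factory: lr_lambda is defined
    # inside another function, which makes it a "local object" for pickle.
    def lr_lambda(step):
        return min(1.0, step / max(1, num_warmup_steps))

    return lr_lambda


def top_level_schedule(step):
    # A module-level function is pickled by reference to its importable name.
    return 1.0


def can_pickle(obj):
    # Return True if pickle can serialize obj, otherwise report why not.
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError) as err:
        print(f"pickle failed: {err}")
        return False


local_ok = can_pickle(make_schedule(100))
top_level_ok = can_pickle(top_level_schedule)
print("local function picklable:", local_ok)
print("top-level function picklable:", top_level_ok)
```

Since the Trainer object holds a learning-rate scheduler built around such a local function, pickling the whole trainer hits this limitation; saving only the model avoids it.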
You can save models with trainer.save_model("path_to_save"). Another cool thing you can do is push your model to the Hugging Face Hub. I added a couple of lines to the notebook to show you, here. You can find the pushing step there.
Thank you very much for helping me Merve. Huge Thanks.
Just one more question if you don’t mind: I’ll now use my model locally at first. You helped me to save all the files I need to load it again.
So to use the same model I saved with trainer.save_model(path), do I just need to call trainer.load(path)?
You can simply load the model using the model class’ from_pretrained(model_path) method like below:
(you can either save locally and load from local or push to Hub and load from Hub)
from transformers import BertModel
# if the model is on the Hugging Face Hub
model = BertModel.from_pretrained("bert-base-uncased")
# from a local folder
model = BertModel.from_pretrained("./test/saved_model/")
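To make the save-then-load round trip concrete, here is a small sketch that runs offline. It assumes a tiny, randomly initialised BERT classifier as a stand-in for the fine-tuned model, and it calls model.save_pretrained directly instead of trainer.save_model so that no training run is needed. The config sizes and the temporary folder are illustrative only.

```python
import tempfile

import torch
from transformers import (
    AutoModelForSequenceClassification,
    BertConfig,
    BertForSequenceClassification,
)

# Tiny random model: an assumption standing in for the fine-tuned model.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=2,
)
model = BertForSequenceClassification(config).eval()

with tempfile.TemporaryDirectory() as save_dir:
    # Writes config.json and the weights file into the folder.
    model.save_pretrained(save_dir)
    # Load the folder back with a class that includes the classification head.
    reloaded = AutoModelForSequenceClassification.from_pretrained(save_dir).eval()

# Arbitrary token ids, only used to compare the two models' outputs.
input_ids = torch.tensor([[1, 5, 7, 2]])
with torch.no_grad():
    original_logits = model(input_ids=input_ids).logits
    reloaded_logits = reloaded(input_ids=input_ids).logits

same_outputs = torch.allclose(original_logits, reloaded_logits)
print("reloaded model gives the same logits:", same_outputs)
```

With a real fine-tuned model, the folder passed to from_pretrained would be the one given to trainer.save_model.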
Another cool thing you can use is the pipeline API; it will make your life much easier. With pipelines, you do not have to deal with the internals of the model or tokenizer to run inference: you simply give it the folder and it gets the model ready for inference.
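As a rough sketch of the pipeline approach, the snippet below builds a throwaway folder containing a tiny, randomly initialised BERT classifier and a toy vocabulary, then points pipeline at that folder. The vocabulary, config sizes and folder name are assumptions made so the example runs offline; with a real model you would pass the folder you saved to, which needs to contain the tokenizer files as well.

```python
import tempfile
from pathlib import Path

from transformers import (
    BertConfig,
    BertForSequenceClassification,
    BertTokenizer,
    pipeline,
)

# Toy vocabulary (assumption): a real fine-tuned model ships its own vocab.
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]",
         "this", "movie", "was", "great", "bad"]

with tempfile.TemporaryDirectory() as tmp:
    vocab_file = Path(tmp) / "vocab.txt"
    vocab_file.write_text("\n".join(vocab) + "\n", encoding="utf-8")
    tokenizer = BertTokenizer(str(vocab_file))

    # Tiny random classifier standing in for the fine-tuned model.
    config = BertConfig(
        vocab_size=len(vocab),
        hidden_size=32,
        num_hidden_layers=1,
        num_attention_heads=2,
        intermediate_size=64,
        num_labels=2,
    )
    model = BertForSequenceClassification(config)

    # Save model and tokenizer side by side in one folder.
    model_dir = Path(tmp) / "saved_model"
    model.save_pretrained(model_dir)
    tokenizer.save_pretrained(model_dir)

    # The pipeline loads both the model and the tokenizer from the folder.
    classifier = pipeline("text-classification", model=str(model_dir))
    predictions = classifier("this movie was great")

print(predictions)
```

Because the weights here are random, the predicted label and score are meaningless; the point is only that a single folder path is enough to get predictions.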
I found the error: instead of model = DistilBertModel.from_pretrained(path)
I changed to model = AutoModelForSequenceClassification.from_pretrained(path)
@slowturtle Just to avoid confusion in the future: the BertModel classes are simply BERT models without classification heads on top, whereas the ...ForSequenceClassification classes (such as AutoModelForSequenceClassification) include a classification head and thus return logits.
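The difference shows up in the model outputs. The sketch below uses a tiny, randomly initialised config (an assumption, chosen so that nothing has to be downloaded) and compares what the base model and the classification model return for the same input. Setting num_labels=3 is also how a head with more than two classes would be configured.

```python
import torch
from transformers import BertConfig, BertForSequenceClassification, BertModel

# Tiny illustrative config; num_labels sets the size of the classification head.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=3,
)
input_ids = torch.tensor([[1, 5, 7, 2]])  # one sequence of four token ids

base = BertModel(config).eval()
classifier = BertForSequenceClassification(config).eval()

with torch.no_grad():
    base_out = base(input_ids=input_ids)
    clf_out = classifier(input_ids=input_ids)

# Base model: one hidden vector per token, no class scores.
hidden_shape = tuple(base_out.last_hidden_state.shape)
# Classification model: one score (logit) per label for the whole sequence.
logits_shape = tuple(clf_out.logits.shape)

print("hidden states:", hidden_shape)  # (batch, sequence length, hidden size)
print("logits:", logits_shape)         # (batch, number of labels)
```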
I might be late, but the tutorial that you have shared is excellent. My only question is whether the same model can be trained for a multiclass text classification problem as well. If so, what parameters do I need to keep in mind while training this model? Also, will this be successful for smaller datasets (<1000 records)? It would be great to see a notebook for the problem I have just described, if you have one.
I run out of CUDA memory when saving a larger model using this. Is there a way I can move a GPU-trained model to ‘cpu’ before saving with trainer.save_model(_)? Appreciate the help, thanks!
Hello. After fine-tuning a DistilBERT model on my own custom dataset for classification, I am trying to save the model in .pth file format (e.g. distilmodel.pth). After training the model using the Trainer from the transformers library, it saves a couple of files into a checkpoint output folder, as declared in the Trainer’s arguments.
Any help converting the checkpoint into a model.pth file?
Thanks in advance.
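One possible way to do this, offered as a sketch rather than a tested recipe for your setup: load the checkpoint folder with from_pretrained, move the model to the CPU, and write its state_dict with torch.save. The snippet assumes a tiny, randomly initialised DistilBERT classifier in place of the real checkpoint, and the folder name checkpoint-500 is hypothetical.

```python
import os
import tempfile

import torch
from transformers import (
    AutoModelForSequenceClassification,
    DistilBertConfig,
    DistilBertForSequenceClassification,
)

# Tiny random DistilBERT: an assumption standing in for the trained checkpoint.
config = DistilBertConfig(
    vocab_size=100, dim=32, n_layers=1, n_heads=2, hidden_dim=64, num_labels=2
)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical checkpoint folder, created here only so the sketch runs.
    checkpoint_dir = os.path.join(tmp, "checkpoint-500")
    DistilBertForSequenceClassification(config).save_pretrained(checkpoint_dir)

    # 1. Load the checkpoint folder.
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint_dir)
    # 2. Make sure the weights live on the CPU before writing them out.
    model.to("cpu")
    # 3. Save only the weights (state_dict) to a single .pth file.
    pth_path = os.path.join(tmp, "distilmodel.pth")
    torch.save(model.state_dict(), pth_path)
    pth_exists = os.path.exists(pth_path)

    # Loading back: rebuild the architecture, then load the weights into it.
    restored = DistilBertForSequenceClassification(config)
    load_result = restored.load_state_dict(torch.load(pth_path, map_location="cpu"))

print("saved .pth file:", pth_exists)
print("missing keys:", load_result.missing_keys)
```

Note that a .pth file written this way holds only the weights, so the config (and the tokenizer files) still have to be kept somewhere to rebuild the model later.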