If I use trainer.train() and then save the model, is that still useful?

I’m wondering: if I run training with the Trainer (rather than with model.train()) and then save the model, is the saved model still useful on its own, or do I need to save the Trainer as well?

And then, how do I load it afterwards: as a Trainer or as a model?


Hi,

When model.train() has completed and the model has been saved with torch.save(model, 'model.pth'), I load it back with torch.load('model.pth').

If I load the model with from_pretrained before using the Trainer, then after fine-tuning I still use from_pretrained to load the fine-tuned model from the saved checkpoint.

You can test with a small model to check whether both loading approaches work.
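A minimal sketch of both paths; t5-small stands in for whatever model you fine-tuned, and the checkpoint path is just an example of the directories the Trainer writes under its output_dir:

import torch
from transformers import AutoModelForSeq2SeqLM

# stand-in for the fine-tuned model; in practice this is whatever you just trained
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Option 1: pickle the whole model object
torch.save(model, "model.pth")
model = torch.load("model.pth")  # recent PyTorch versions may need weights_only=False

# Option 2: reload a saved checkpoint directory with from_pretrained
# (the path is an example; the Trainer writes checkpoints under its output_dir)
model = AutoModelForSeq2SeqLM.from_pretrained("output_dir/checkpoint-500")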

Do I need to use torch.save()? I have no issues, I’m just asking: I tried model.save_pretrained() (which didn’t seem to be available for my T5 model) and then model.save(), but nothing happened.

Right now I’m training with trainer.train(), and I guess I can then save the resulting model?

What I do is I load a pretrained one like so:

model = AutoModelForSeq2SeqLM.from_pretrained("t5-large")

Then I use the trainer like so:

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_tokenized_books,
    eval_dataset=eval_tokenized_books,
    tokenizer=tokenizer,
    data_collator=data_collator,
)

trainer.train()

Can I afterwards do torch.save(model, 'model.pth') like you mention, or did I screw up somehow?

There is a trainer.save_model method that will save it for you in a format that is compatible with from_pretrained.
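Something like this, continuing from the Seq2SeqTrainer defined in your snippet; the output directory name is just an example:

# after trainer.train() has finished
trainer.save_model("finetuned-t5-large")  # writes config, weights, and the tokenizer (since it was passed to the Trainer)

# later, reload it exactly like any pretrained checkpoint
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("finetuned-t5-large")
tokenizer = AutoTokenizer.from_pretrained("finetuned-t5-large")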


ooh ok! Thank you!

I have another question, if I may: what is the difference between training with the Trainer versus with the model directly, i.e. trainer.train() vs model.train()? Is there any practical difference? I’ve been looking at the docs but can’t quite figure it out.
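For anyone finding this later: model.train() is the plain PyTorch call that only switches the module into training mode (enabling dropout and similar layers); it does not train anything by itself. trainer.train() runs the whole fine-tuning loop for you (batching, optimizer steps, evaluation, checkpointing, logging). A rough sketch of the manual loop the Trainer replaces, reusing the names from the snippet above; the optimizer, learning rate, and batch size here are arbitrary choices:

import torch
from torch.utils.data import DataLoader

# Manual PyTorch loop: model.train() only flips the training-mode flag,
# the actual optimization still has to be written by hand.
# Unlike the Trainer, this assumes the dataset contains only tensor columns
# (raw text columns would have to be removed first).
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loader = DataLoader(train_tokenized_books, batch_size=8, collate_fn=data_collator)

model.train()  # enable dropout etc.; nothing is trained by this call alone
for batch in loader:
    outputs = model(**batch)  # the loss is returned because the batch contains labels
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# trainer.train() wraps all of the above, plus evaluation, checkpointing, logging, ...
trainer.train()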