This might be related, as it discusses a recently fixed bug with a Colab notebook.
But to answer your question about the results: training with the Trainer or with your own loop should give the same results, as long as you use the same loss function and hyperparameters.
The Trainer is mostly there to take the boilerplate out of your way, especially for mixed-precision, distributed, and TPU training. It only runs the training loop (with good hyperparameter defaults), so it should match your manual training loop.
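To make the equivalence concrete, here is a toy sketch in plain Python (no transformers, no torch — the helper names `trainer_fit` and `manual_fit` are made up for illustration): a "trainer"-style helper and a hand-written loop produce identical results when the loss function and hyperparameters (learning rate, number of epochs) match, because they perform the exact same updates.

```python
def loss_grad(w, x, y):
    # Gradient of the squared error 0.5 * (w * x - y)**2 with respect to w.
    return (w * x - y) * x

def trainer_fit(w, data, lr, epochs):
    # Stand-in for a library Trainer: the same update rule, just wrapped up
    # so the caller never writes the loop themselves.
    for _ in range(epochs):
        for x, y in data:
            w -= lr * loss_grad(w, x, y)
    return w

def manual_fit(w, data, lr, epochs):
    # The same loop written out by hand.
    for _ in range(epochs):
        for x, y in data:
            w -= lr * loss_grad(w, x, y)
    return w

# Fit y = 2x from three samples; both paths take identical SGD steps.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w_trainer = trainer_fit(0.0, data, lr=0.01, epochs=50)
w_manual = manual_fit(0.0, data, lr=0.01, epochs=50)
assert w_trainer == w_manual  # bit-for-bit identical trajectories
```

The same reasoning carries over to the real Trainer: differences you see in practice usually come from a different loss, optimizer, scheduler, or seed, not from the Trainer itself.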
Unfortunately, I never got it to work with the Transformer-XL implementation I was working on, but I modified BERT to fit my application and it works with that instead.