Is a native PyTorch training loop much slower than Trainer?

You’re spot on! If `requires_grad` isn’t set to `False` for the earlier layers, PyTorch ends up training the whole model instead of just the last layer. Freezing the earlier layers by setting `requires_grad=False` helps focus training where it’s needed.
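Here’s a minimal sketch of what that looks like, assuming a BERT-style classification model from `transformers` (the checkpoint name and the `classifier` attribute are illustrative; your model’s head may be named differently):

```python
import torch
from transformers import AutoModelForSequenceClassification

# Illustrative checkpoint; substitute your own.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Freeze every parameter first...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze only the classification head, so gradients
# are computed (and stored) for it alone.
for param in model.classifier.parameters():
    param.requires_grad = True

# Pass only the trainable parameters to the optimizer, so it
# doesn't allocate optimizer state for the frozen layers either.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=5e-5
)
```

This also keeps the optimizer from maintaining momentum/variance buffers for frozen weights, which saves memory on top of the compute savings from skipping their gradients.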