You’re spot on! If `requires_grad` isn’t set to `False` for the earlier layers, PyTorch ends up training the whole model instead of just the last layer. Freezing the earlier layers by setting `requires_grad=False` helps focus training where it’s needed.
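Here’s a minimal sketch of that idea, assuming a torchvision ResNet-18 backbone and a 10-class target (both are just illustrative choices, not from the original discussion):

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pretrained backbone (ResNet-18 chosen only as an example)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze all existing parameters so the pretrained layers stay fixed
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer; freshly created parameters default to requires_grad=True
model.fc = nn.Linear(model.fc.in_features, 10)  # 10 = assumed number of classes

# Only hand the trainable parameters to the optimizer
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

With this setup, gradients are only computed and applied for `model.fc`, so training effort (and memory) goes exactly where you want it.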