Trainer() and requires_grad=False

Hey guys, I just have a quick question: if I freeze 12 out of 16 layers (requires_grad=False), does the Hugging Face Trainer() load everything into memory anyway?

Also curious, @Dampish: did you solve the problem?