I have a LlamaForCausalLM model. I want to do a single run of backprop on a single sample (one forward pass, one backward pass) and record all the gradients that are computed in the process. I do not want to actually update the model weights; I just want to record the gradients. The model is pretty big and I only have a single GPU, so to make this fit in memory I need to use gradient checkpointing. Is there a way to use a Trainer to accomplish this? Thanks!
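In case it helps, here is roughly what I have in mind, as a sketch rather than something I know to be the supported way (the checkpoint name and prompt are placeholders): run the Trainer for exactly one step with `learning_rate=0.0` so the AdamW update is a no-op, and override `training_step` to copy the gradients out after backward has run but before the optimizer step:

```python
import torch
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; substitute your checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model.config.use_cache = False  # KV cache is incompatible with gradient checkpointing

# A single-sample dataset: the labels are the input_ids for standard causal LM loss.
enc = tokenizer("The quick brown fox jumps over the lazy dog.")
train_dataset = Dataset.from_dict(
    {
        "input_ids": [enc["input_ids"]],
        "attention_mask": [enc["attention_mask"]],
        "labels": [enc["input_ids"]],
    }
)


class GradRecorderTrainer(Trainer):
    """Copies parameter gradients right after backward, before optimizer.step()."""

    def training_step(self, model, *args, **kwargs):
        # *args/**kwargs to stay compatible across training_step signatures.
        loss = super().training_step(model, *args, **kwargs)  # forward + backward
        self.recorded_grads = {
            name: p.grad.detach().cpu().clone()
            for name, p in model.named_parameters()
            if p.grad is not None
        }
        return loss


args = TrainingArguments(
    output_dir="grad_dump",
    per_device_train_batch_size=1,
    max_steps=1,                  # one forward pass, one backward pass
    learning_rate=0.0,            # AdamW with lr=0 leaves the weights untouched
    gradient_checkpointing=True,  # trade compute for activation memory
    gradient_checkpointing_kwargs={"use_reentrant": False},
    save_strategy="no",
    report_to="none",
)

trainer = GradRecorderTrainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()

grads = trainer.recorded_grads  # dict: parameter name -> gradient tensor on CPU
print(f"recorded {len(grads)} gradient tensors")
```

If the Trainer machinery turns out to be unnecessary, I assume a plain `model.gradient_checkpointing_enable()` followed by one manual forward/backward and reading each parameter's `.grad` would record the same thing, but I'd like to know whether Trainer supports this more directly.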