I am using Accelerate to run distributed inference (I use scores from a pretrained model to do other things in a program). Currently I am getting an OOM error fairly far into my eval dataset. Since I have a fixed batch size and pad all batches to the same max sequence length, it doesn't seem like the model or one particular oversized batch is the problem.
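For context, here is the kind of helper I've been using to watch GPU memory between batches (just a sketch; the helper name is mine, and it silently does nothing on CPU-only machines):

```python
import torch

def log_cuda_memory(tag: str) -> None:
    """Print current CUDA memory usage, labeled with a tag."""
    if not torch.cuda.is_available():
        return  # nothing to report on CPU-only machines
    allocated = torch.cuda.memory_allocated() / 2**20  # MiB actually in tensors
    reserved = torch.cuda.memory_reserved() / 2**20    # MiB held by the caching allocator
    print(f"[{tag}] allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB")
```

Calling this at the start of each dataset (e.g. `log_cuda_memory("dataset 3")`) is what shows me the allocated number creeping up over time rather than spiking on one batch.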
I was hoping someone could help me understand more clearly what happens under the hood so I can debug.
When I run:
```python
accelerator = Accelerator()
dataloader, NLI_model = accelerator.prepare(dataloader, NLI_model)
```
- Does the entire dataloader get moved to the GPU by the `prepare` method? If I store a bunch of samples in my dataset, do these sit in GPU RAM, or do batches only get moved to the GPU when they are collated?
- Since I am running on several datasets, I call `accelerator.prepare` many times. Could this somehow be accumulating memory?
- Should I put everything in one dataset first?
- Should I keep using the same accelerator, and if so, do I need to do something to clear out the old dataloader before preparing a new one?