Accelerator OOM


I am using Accelerate to run distributed inference (I use scores from a pretrained model to do other things in a program). Currently I am getting an OOM error fairly far into my eval dataset. Since I have a fixed batch size and pad all batches to the same max sequence length, it seems unlikely that the model itself or any particular batch being too big is the problem.

I was hoping someone could help me understand more clearly what happens under the hood so I can debug.

When I run:

accelerator = Accelerator()
dataloader, NLI_model = accelerator.prepare(
    dataloader, NLI_model
)

  1. Does the entire dataloader get put on the GPU by the prepare method? So if I store a bunch of samples in my dataset, do these sit in GPU RAM? Or do they get put on the GPU in the collator?
  2. Since I am running on several datasets, I call accelerator.prepare many times. Could this be somehow accumulating memory?
    • Should I put everything in one dataset first?
    • Should I be using the same accelerator, and if so, do I need to do something to clear the old dataloader before adding a new one?


Please prepare the model only once, and let us know if that helps.

It does not help. I suspect this does not impact memory usage at all, though perhaps it gives a slight runtime benefit?
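In case it is relevant to others hitting a slow-growing OOM: one pattern that makes GPU memory climb steadily over an eval run is appending raw GPU outputs to a Python list every batch, since each stored tensor (and any autograd history it carries) stays alive on the device. A minimal sketch of storing detached CPU copies instead (toy model and batches, not my actual pipeline):

```python
import torch

model = torch.nn.Linear(8, 3)                 # stand-in for the prepared model
batches = [torch.randn(16, 8) for _ in range(4)]

all_scores = []
with torch.no_grad():                         # no autograd graph is built
    for batch in batches:
        scores = model(batch)
        # Detach and move to CPU so nothing from this batch keeps GPU memory alive.
        all_scores.append(scores.detach().cpu())

print(len(all_scores))  # 4
```

This keeps per-batch GPU usage flat regardless of how many scores are accumulated over the run.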