Model is not properly moved to GPU memory with torch.no_grad()

Hi everyone,

I’m using the OwlViTForObjectDetection model and I want to perform inference on the GPU, so I’m doing something like:

model = model.to(device='cuda')
with torch.no_grad():
    model.eval()
    data = data.to(device='cuda')
    # inference code

It seems that torch.no_grad() is somehow preventing some of the model’s parameters from being copied to GPU memory, because I’m getting an error saying that all tensors should be on the same device but at least two different devices were found (cuda and cpu). If I remove torch.no_grad() the error goes away, but then I get an out-of-memory error because all of the model’s activations are kept in GPU memory for gradient computation.
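
For reference, a quick debugging sketch to list which devices the model’s parameters and buffers actually end up on:

# Quick check: which devices do the parameters and buffers live on?
print({p.device for p in model.parameters()})
print({b.device for b in model.buffers()})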

This has never happened to me in the past with the various models I’ve been using, so I’m wondering whether it is specific to Hugging Face models. Has this occurred to anyone else? Are there any known workarounds?

Thank you!

Hi @ekazakos,

You are probably getting a GPU error unrelated to torch.no_grad() if you installed the PyPI release of transformers with pip install transformers. Sorry about that! This issue was fixed a few weeks ago, and you should be able to run the model without any problems if you install the development branch instead:
pip install -q git+https://github.com/huggingface/transformers.git

In general, there is no need to call the eval() method inside the torch.no_grad() block; the pattern below should work. If the issue persists, could you copy-paste a minimal snippet that reproduces the error along with the full error log?

model = model.to(device='cuda')
model.eval()
with torch.no_grad():
    data = data.to(device='cuda')
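
For completeness, here is a rough end-to-end sketch of GPU inference with OWL-ViT; the checkpoint name and the sample image URL are just placeholders to swap for your own data:

import requests
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

# Placeholder checkpoint and image; replace with your own.
processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")

image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
texts = [["a photo of a cat", "a photo of a dog"]]
inputs = processor(text=texts, images=image, return_tensors="pt")

device = "cuda"
model.to(device)          # moves the module's parameters in place
model.eval()              # disables dropout etc.
with torch.no_grad():     # disables autograd bookkeeping to save memory
    inputs = {k: v.to(device) for k, v in inputs.items()}
    outputs = model(**inputs)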

Hope this helps!

Thank you @adirik! I’ll try it shortly and let you know!


Note that PyTorch moves a model in-place, so it’s sufficient to do:

model.to(device="cuda")
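
To make the distinction concrete (a small sketch): nn.Module.to() moves the module’s parameters in place and returns the same module, while Tensor.to() returns a new tensor, so inputs still need the reassignment:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
model.to("cuda")                         # in place for modules; reassignment optional
print(next(model.parameters()).device)   # cuda:0

x = torch.randn(1, 4)
x.to("cuda")                             # returns a new tensor; x itself stays on CPU
print(x.device)                          # cpu
x = x.to("cuda")                         # reassignment is required for tensors
print(x.device)                          # cuda:0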

Thanks! I’ve been using PyTorch for a few years now and I didn’t know about this :grinning:

Hi @adirik,

It works! Thank you! :heart:
