"CUDA error: all CUDA-capable devices are busy or unavailable" when using

When I try to call the http://api-inference.huggingface.co/gpu endpoint, I get the following error:

{'error': 'CUDA error: all CUDA-capable devices are busy or unavailable\nCUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.'}

The same code works against http://api-inference.huggingface.co/cpu.

Is there anything I am missing to use accelerated inference on GPU?
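For context, the request I am sending looks roughly like this. This is a minimal sketch: the payload contents and the `hf_xxx` token are placeholders, and only the `/gpu` vs. `/cpu` URL suffix differs between the failing and working calls.

```python
import json
import urllib.request

# Placeholder values, not real credentials or inputs.
API_URL = "http://api-inference.huggingface.co/gpu"  # the /cpu variant works
TOKEN = "hf_xxx"  # placeholder API token

payload = json.dumps({"inputs": "Hello world"}).encode("utf-8")
req = urllib.request.Request(
    API_URL,
    data=payload,
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) sends the request; against the /gpu URL it
# returns the CUDA error above, while the /cpu URL returns a normal result.
```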