AutoTrain Advanced: FATAL ERROR: NVIDIA Management Library (NVML) not found

Hi, I’m following along with this video, but instead of using a zip file, I’m using the chest x-ray dataset hosted on the Hub from Julien Simon’s video.

I’m getting the following error in my logs when I click on Train:

FATAL ERROR: NVIDIA Management Library (NVML) not found.
HINT: The NVIDIA Management Library ships with the NVIDIA display driver (available at
      https://www.nvidia.com/Download/index.aspx), or can be downloaded as part of the
      NVIDIA CUDA Toolkit (available at https://developer.nvidia.com/cuda-downloads).
      The lists of OS platforms and NVIDIA-GPUs supported by the NVML library can be
      found in the NVML API Reference at https://docs.nvidia.com/deploy/nvml-api.

I’ve already added my HF_TOKEN, and tried:

  1. Restarting the Space
  2. Factory Reboot

Why is this error occurring? What can I do to fix this error?

For context, I’m creating this project to teach young students how to use the no-code interfaces on Hugging Face and Teachable Machine Learning.

You need to specify a train split.

I'm having a similar error, and specifying the training split is not fixing it. It's suggesting this: "pip3 install --force-reinstall nvidia-ml-py"

Facing same issue

That's just a warning. Ignore it.
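
If you want to check this on your own machine or Space, a minimal sketch is to ask NVML directly whether it can initialise. This assumes the `nvidia-ml-py` package (imported as `pynvml`) is installed; the helper name `nvml_available` is made up for illustration. On hardware without an NVIDIA driver it should report False, which would be consistent with the message being a warning there rather than the cause of a failed run.

```python
def nvml_available() -> bool:
    """Return True if the NVIDIA Management Library can be initialised."""
    try:
        import pynvml  # provided by the nvidia-ml-py package
    except ImportError:
        # Package not installed, so NVML cannot be queried from Python.
        return False
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        # Raised when the driver or the NVML shared library is missing.
        return False
    pynvml.nvmlShutdown()
    return True


if __name__ == "__main__":
    print("NVML available:", nvml_available())
```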

ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
FATAL ERROR: NVIDIA Management Library (NVML) not found.
HINT: The NVIDIA Management Library ships with the NVIDIA display driver (available at
      https://www.nvidia.com/Download/index.aspx), or can be downloaded as part of the
      NVIDIA CUDA Toolkit (available at https://developer.nvidia.com/cuda-downloads).
      The lists of OS platforms and NVIDIA-GPUs supported by the NVML library can be
      found in the NVML API Reference at https://docs.nvidia.com/deploy/nvml-api.

I'm facing the above issue when running AutoTrain.
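
Note that the ValueError at the top of that log looks separate from the NVML message: it is the error scikit-learn raises when a stratified split is requested and some class has only one example. A rough way to find such classes before training is to count the labels. The sketch below uses only the standard library; the function name `rare_classes` and the sample labels are hypothetical and stand in for your dataset's label column.

```python
from collections import Counter


def rare_classes(labels, min_count=2):
    """Return {label: count} for every label seen fewer than min_count times."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n < min_count}


# Hypothetical label column; replace with the labels from your own dataset.
labels = ["normal", "normal", "pneumonia", "pneumonia", "other"]
print(rare_classes(labels))  # {'other': 1}
```

Any class that shows up here needs at least one more example (or needs to be dropped or merged) before a stratified train/validation split can succeed.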

Facing same issue

Use a GPU for large models.

I am facing the same issue when running locally on a Mac M2 Pro.

I also got this error when I tried the AutoTrain app:
Your installed package nvidia-ml-py is corrupted. Skip patch functions nvmlDeviceGet{Compute,Graphics,MPSCompute}RunningProcesses. You may get incorrect or incomplete results. Please consider reinstall package nvidia-ml-py via pip3 install --force-reinstall nvidia-ml-py nvitop.
Your installed package nvidia-ml-py is corrupted. Skip patch functions nvmlDeviceGetMemoryInfo. You may get incorrect or incomplete results. Please consider reinstall package nvidia-ml-py via pip3 install --force-reinstall nvidia-ml-py nvitop.

This is not an error. You can ignore it and move on. On a MacBook, you need to disable quantization.

OK, I removed it as you mentioned and it ran as expected.
But I can see an error mentioning no GPU support for bitsandbytes.
I can see that it does not support MPS yet (Support for Apple silicon · Issue #252 · bitsandbytes-foundation/bitsandbytes · GitHub).

I get these errors:

/Users/temme/Documents/pogo/framework-validator/autotrain/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
'NoneType' object has no attribute 'cadam32bit_grad_fp32'

My question is: does it use the CPU or the GPU?
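
One way to see which device is even available is to ask PyTorch directly. This is only a sketch: `pick_device` is a hypothetical helper, and it reports what PyTorch can see, not necessarily what AutoTrain ends up using for a given run.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can see."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to query.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend (Apple silicon) only exists in newer PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("Device visible to PyTorch:", pick_device())
```

On an Apple silicon Mac with a recent PyTorch this would normally print `mps`; the bitsandbytes warning above only says that its 8-bit and quantized code paths have no GPU backend there.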

You need to set quantization to none. Please follow the params here: How to Finetune phi-3 on MacBook Pro

Thanks abhishek, I was able to run it with TinyLlama.