Accelerate on single GPU doesn't seem to work

I'm new to the Hugging Face community and to ML. I started playing around with Accelerate and followed the instructions set out in the tutorials.

The issue I seem to be having is that I used accelerate config and set my machine to use my GPU, but looking at the resource monitor my GPU usage sits at only about 7%, so I don't think my training is using my GPU at all. I have a 3090 Ti.

Here is the command that I used to start my training:

accelerate launch train_unconditional.py \
  --train_data_dir="data" \
  --resolution=256 \
  --center_crop \
  --random_flip \
  --output_dir="output-256" \
  --train_batch_size=2 \
  --save_model_epochs=5 \
  --num_epochs=100 \
  --gradient_accumulation_steps=1 \
  --use_ema \
  --learning_rate=1e-4 \
  --lr_warmup_steps=500 \
  --mixed_precision=no
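
For reference, accelerate env (it ships with the Accelerate CLI) prints the config and environment that accelerate launch will actually pick up, which is a quick way to see whether the config file really points at the GPU:

accelerate env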

Is there maybe a library that I am missing, or some other config setting I need to change? In the Accelerate documentation I can only find info about multi-GPU training.

I expect my GPU usage to increase; that's how I assume I can tell that my training is using my GPU. I might be wrong about that, which is why I am reaching out.
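
As a quick sanity check (this is plain PyTorch, nothing Accelerate-specific), the one-liner below should print True if the installed PyTorch build can actually see the GPU; if it prints False, training silently falls back to the CPU:

python -c "import torch; print(torch.cuda.is_available())"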

After many hours of trying to figure this out, I found the solution.

I uninstalled my version of Python (I was using version 3.11.3).
I also uninstalled my version of NVIDIA CUDA (I was using version 12.1).

My new approach

I went to the PyTorch website and found the latest CUDA version it supports (11.8).

Installed Python version 3.10.9.
Installed CUDA version 11.8 from the NVIDIA website.
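
(To double-check the toolkit install at this point, nvcc --version should report release 11.8:)

nvcc --version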

Created my new Python environment and ran the following pip commands:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install git+https://github.com/huggingface/diffusers
pip install accelerate datasets tensorboard

and then went through my accelerate config again, and when I started training my GPU was being used.
The important part was to make sure that the CUDA version installed from NVIDIA matched the CUDA version selected on the PyTorch website, and that NVIDIA CUDA was installed BEFORE installing the CUDA-enabled PyTorch build through pip.
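
A quick way to confirm everything lines up after the reinstall (assuming the cu118 wheels installed correctly) is:

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"

which should print a torch version ending in +cu118, then 11.8, then True.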