Hi,
I’m trying to run the example script diffusers/examples/unconditional_image_generation/train_unconditional.py (main branch of huggingface/diffusers on GitHub) to train a diffusion model on my own dataset. I followed all the steps in the tutorial and configured Accelerate as follows:
In which compute environment are you running?
Please select a choice using the arrow or number keys, and selecting with enter
- This machine
Which type of machine are you using?
Please select a choice using the arrow or number keys, and selecting with enter - No distributed training
Do you want to run your training on CPU only (even if a GPU / Apple Silicon / Ascend NPU device is available)? [yes/NO]:NO
Do you wish to optimize your script with torch dynamo?[yes/NO]:NO
Do you want to use DeepSpeed? [yes/NO]: NO
What GPU(s) (by id) should be used for training on this machine as a comma-seperated list? [all]:[all]
Do you wish to use FP16 or BF16 (mixed precision)?
Please select a choice using the arrow or number keys, and selecting with enter - no
Running accelerate env in the terminal gives:
Copy-and-paste the text below in your GitHub issue
- Accelerate version: 0.28.0
- Platform: Windows-10-10.0.19045-SP0
- Python version: 3.10.9
- Numpy version: 1.26.3
- PyTorch version (GPU?): 2.2.2+cu118 (True)
- PyTorch XPU available: False
- PyTorch NPU available: False
- System RAM: 13.86 GB
- GPU type: NVIDIA GeForce GTX 1650 (ignore my poor GPU XD, I’m a student and a beginner in ML)
- Accelerate default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: NO
- mixed_precision: no
- use_cpu: False
- debug: False
- num_processes: 1
- machine_rank: 0
- num_machines: 1
- gpu_ids: [all]
- rdzv_backend: static
- same_network: True
- main_training_function: main
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env:
But when I launch the script with the command from the tutorial, the log shows that Accelerate is running on the CPU instead of my GPU:
accelerate launch train_unconditional.py --dataset_name="mihaien/my-dataset" --resolution=64 --center_crop --random_flip --output_dir="ddpm-metaphors-64" --train_batch_size=16 --num_epochs=50 --gradient_accumulation_steps=1 --use_ema --learning_rate=1e-4 --lr_warmup_steps=500 --mixed_precision=no --push_to_hub
04/03/2024 12:05:07 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cpu
Mixed precision type: no
I also tried a solution that I found in another thread, but it didn’t work for me: Accelerate on single GPU doesnt seem to work - #2 by xtcgoat
PS: I’m using PyCharm.
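A minimal sketch of a check that might help narrow this down, assuming it is run with the same Python interpreter that PyCharm uses for the training script. It only reports what PyTorch and Accelerate see; the function name describe_environment and the report keys are hypothetical, chosen for illustration.

```python
import importlib.util
import os
import sys


def describe_environment():
    """Collect facts that may explain why a run ends up on the CPU."""
    report = {
        # Which interpreter is running; PyCharm may use a different one
        # than the terminal where `accelerate env` was executed.
        "python_executable": sys.executable,
        # If set to an empty string or -1, CUDA devices are hidden.
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        import torch

        report["torch_version"] = torch.__version__
        # None here means a CPU-only PyTorch build.
        report["torch_cuda_build"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
        if report["cuda_available"]:
            report["gpu_name"] = torch.cuda.get_device_name(0)
    if importlib.util.find_spec("accelerate") is not None:
        from accelerate import Accelerator

        # The device Accelerate selects when created outside `accelerate launch`.
        report["accelerator_device"] = str(Accelerator().device)
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If python_executable or torch_version differs from what accelerate env reported, the script is probably being launched from a different environment than the one that was configured.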
Could someone please help me? Thank you so much!