Training stuck at 0%

My goal is to train a LoRA model starting from a dataset. I used this link to help me with the Python script: LoRA
So I used the script at this link: diffusers/examples/text_to_image/train_text_to_image_lora.py at main · huggingface/diffusers · GitHub
There are two problems:

  1. the script raises an error if I set the number of dataloader workers greater than 0
  2. if I instead set the number of workers to 0, the script raises no errors, but training stays at 0% even if I change parameters such as the number of epochs (see the sketch after this list)

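To separate the data pipeline from the rest of the training script, one thing you could try is iterating the same dataset with a bare PyTorch DataLoader. This is a minimal sketch, not code from the training script: the dataset name comes from the question, and the transforms and collate function are only illustrative. Also, if you are on Windows, num_workers > 0 requires the DataLoader to be created under an `if __name__ == "__main__":` guard; a missing guard is a common cause of that kind of error.

```python
# Minimal sketch to test the data pipeline in isolation.
# Assumptions: dataset name from the question; illustrative transforms/collate.
import torch
from datasets import load_dataset
from torch.utils.data import DataLoader
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(512),
    transforms.CenterCrop(512),
    transforms.ToTensor(),
])

def collate_fn(examples):
    # The dataset exposes "image" (PIL) and "text" (caption) columns.
    pixel_values = torch.stack(
        [preprocess(ex["image"].convert("RGB")) for ex in examples]
    )
    captions = [ex["text"] for ex in examples]
    return {"pixel_values": pixel_values, "captions": captions}

if __name__ == "__main__":  # required on Windows when num_workers > 0
    dataset = load_dataset("lambdalabs/naruto-blip-captions", split="train")
    # Try num_workers=0 first; if this already hangs, the problem is in
    # the data pipeline rather than in the training loop itself.
    loader = DataLoader(dataset, batch_size=1, num_workers=0,
                        collate_fn=collate_fn)
    batch = next(iter(loader))
    print(batch["pixel_values"].shape, batch["captions"][0])
```

If this snippet hangs or errors the same way, the training script is probably not at fault.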
I used the command-line parameters from the link, only replacing lambdalabs/pokemon-blip-captions with lambdalabs/naruto-blip-captions.
What could be causing this?


This is the error…

I discovered that it stays at 0% because the script stops at line 744. I added some print statements after that line, but they never show up in the logger output.
This is the script: diffusers/examples/text_to_image/train_text_to_image_lora.py at e1df77ee1ec0400195ad8dcb2099d137b34c9b9f · huggingface/diffusers · GitHub
Is there a problem initializing the vae variable?
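One way to check is to initialize the VAE by itself, outside the training script, and push a dummy tensor through it. A minimal sketch, assuming runwayml/stable-diffusion-v1-5 as the base model (substitute whatever you pass as --pretrained_model_name_or_path):

```python
# Minimal sketch to check that the VAE loads and runs on its own.
# The model id below is an assumption; use your own base model.
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
)
vae.eval()

with torch.no_grad():
    dummy = torch.randn(1, 3, 512, 512)           # one fake 512x512 RGB image
    latents = vae.encode(dummy).latent_dist.sample()

print("VAE OK, latent shape:", latents.shape)     # expect (1, 4, 64, 64)
```

If this prints the latent shape quickly, the VAE itself is probably fine and the hang is elsewhere.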

Is it possible that the system is running out of memory before training even starts? If you reduce the batch size to one, does it run?
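As a quick check before launching, you could print the free GPU memory (a sketch, not part of the training script) and then try passing --train_batch_size=1; if I remember right, the script also accepts --gradient_checkpointing and --mixed_precision=fp16 to reduce memory use.

```python
# Quick check of free GPU memory before launching training.
import torch

if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()       # values in bytes
    print(f"free: {free / 1e9:.2f} GB / total: {total / 1e9:.2f} GB")
else:
    print("No CUDA device visible to PyTorch")
```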