Training ControlNet on an 8 GB GPU

Hi, I’m trying to train a ControlNet on the basic fill50k dataset, following the ControlNet example in the diffusers repo.
With all the settings provided in the example, my model does not converge.
Has anyone been able to train successfully with this configuration?

accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path=$MODEL_DIR \
  --output_dir=$OUTPUT_DIR \
  --dataset_name=fusing/fill50k \
  --resolution=512 \
  --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
  --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --enable_xformers_memory_efficient_attention \
  --set_grads_to_none \
  --mixed_precision fp16

Here are some results after 12000 steps:
[image]
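
For context, here is a rough sketch of how much data the run has seen so far. It assumes a single GPU and that the 12000 steps are optimizer steps (counted after gradient accumulation); both are my assumptions, not something I have verified in the script.

```python
# Values taken from the launch command above.
train_batch_size = 1
gradient_accumulation_steps = 4

# Assumptions: one GPU, and "steps" means optimizer updates.
num_gpus = 1
optimizer_steps = 12_000

# fill50k contains 50,000 training examples.
dataset_size = 50_000

effective_batch_size = train_batch_size * gradient_accumulation_steps * num_gpus
samples_seen = optimizer_steps * effective_batch_size
epochs_seen = samples_seen / dataset_size

print(effective_batch_size, samples_seen, round(epochs_seen, 2))
# prints: 4 48000 0.96
```

Under those assumptions the model has seen a little under one pass over the dataset.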

Additionally, can anyone share a loss curve for ControlNet training (training loss as a function of steps)?
Does the loss decrease gradually, or does it drop suddenly?
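
To make the question concrete: the per-step diffusion loss is noisy because each step samples random timesteps, so I would smooth it before judging its shape. A minimal sketch of that smoothing, where `ema_smooth` and the sample values are hypothetical and not part of the diffusers script:

```python
def ema_smooth(losses, alpha=0.01):
    """Exponential moving average of a list of per-step loss values.

    A smaller alpha gives a smoother curve that reacts more slowly.
    """
    smoothed = []
    avg = None
    for loss in losses:
        # The first value seeds the average; later values are blended in.
        avg = loss if avg is None else (1 - alpha) * avg + alpha * loss
        smoothed.append(avg)
    return smoothed


# Hypothetical logged losses, for illustration only (not real results).
logged_losses = [0.20, 0.05, 0.31, 0.12, 0.18, 0.09]
for step, value in enumerate(ema_smooth(logged_losses, alpha=0.5)):
    print(step, round(value, 4))
```

A gradual decline and a sudden drop should look clearly different on the smoothed curve, which is what I would like to compare against.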

Thanks