How to increase the quality of a fine-tuned text-to-image LoRA?

I followed the diffusers documentation to create a fine-tuned text-to-image LoRA model for a certain subject. I have images and captions of this subject doing various things; the dataset can be found here: fw1zr/rahul-gandhi-captions · Datasets at Hugging Face.

I followed the diffusers docs for training a text-to-image LoRA on Stable-Diffusion-v1-5 and trained on a 16 GB GPU for over 7 hours, but after running inference I find that the generated outputs are very distorted and low quality.
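For reference, the training invocation followed the docs' `train_text_to_image_lora.py` example; it was something close to this (exact step count and learning rate from memory, so treat them as approximate):

```shell
accelerate launch train_text_to_image_lora.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --dataset_name="fw1zr/rahul-gandhi-captions" \
  --resolution=512 \
  --train_batch_size=1 \
  --max_train_steps=15000 \
  --learning_rate=1e-4 \
  --lr_scheduler="cosine" \
  --checkpointing_steps=500 \
  --validation_prompt="photo of rahul gandhi, smiling" \
  --output_dir="rahul-gandhi-lora"
```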

prompt: photo of rahul gandhi, smiling, beard look, wearing glasses, speaking, with one hand up

Here is the script I used for inference:

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler


model_base = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(model_base, torch_dtype=torch.float16, use_safetensors=True)

pipe.unet.load_attn_procs("BootesVoid/rahul-gandhi-lora")
pipe.to("cuda")


generator = torch.Generator("cuda").manual_seed(17677)
image = pipe(
    "photo of rahul gandhi, walking",
    num_inference_steps=100,
    guidance_scale=7.5,
    generator=generator,
    cross_attention_kwargs={"scale": 0.7},
).images[0]
image

The model can be found here: BootesVoid/rahul-gandhi-lora · Hugging Face

How do I get this model to produce high-quality, photorealistic output? Do I have to switch to SDXL for fine-tuning, or add some sort of upscaler to the pipeline? Or am I not running inference correctly?