I have trained a LoRA with kohya_ss on the base model (stabilityai/stable-diffusion-xl-base-1.0) using 10 images. I was wondering where the output comes from (the base model or my customized training).
What percentage of each is the final output composed of?
E.g.:
(Base Model:60%, Customized Training:40%)
(Base Model:70%, Customized Training:30%)
For example:
The prompt is: DNA has to be shown in the background with an Indian-Woman-with-Mouth-Cancer in the Foreground
from diffusers import AutoPipelineForText2Image, AutoencoderKL
import torch
import os
import numpy as np
from PIL import Image
print("vae")
# Clear GPU memory before starting
torch.cuda.empty_cache()
# Set seed for reproducibility
#torch.manual_seed(6666666)
#np.random.seed(6666666)
# Define the path to the directory containing your model and LoRA weights
print("Define the path to the directory containing your model and LoRA weights")
model_dir = "D:\\Ganu\\AIImage\\huggingface\\kohya_ss\\kohya_ss\\trained-model\\model\\"
lora_weights_path = os.path.join(model_dir, "last.safetensors")
# Load the base model using StableDiffusionPipeline
print("Load the base model using StableDiffusionPipeline")
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
adapter_id = "wangfuyun/PCM_SDXL_LoRAs"
#vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
# enable_sequential_cpu_offload manages device placement itself, so the pipeline should not be moved with .to() first
pipeline = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float32, variant="fp16")
pipeline.enable_sequential_cpu_offload()
pipeline.enable_attention_slicing("max")
# Load the LoRA weights
print("Load the LoRA weights")
try:
    pipeline.load_lora_weights(lora_weights_path, weight_name="last.safetensors")
except ValueError as e:
    print("Invalid LoRA checkpoint. Please check the compatibility and format of the weights file.")
    raise e
# Generate an image from a text prompt
print("Generate an image from a text prompt")
text_prompt = "DNA has to be shown in the background with a Indain-Woman-with-Mouth-Cancer in the Foreground"
generated_image = pipeline(prompt=text_prompt).images[0]
generated_image.save("generated_image.png")
generated_image.show()
Good evening. That question is essentially impossible to answer…
The answer would be something like "it depends on the base model", "it depends on what you want to express with the LoRA (if it's something like the characteristics of a person or a character, then the LoRA will have a big impact)", or "it can't be expressed as a percentage in the first place".
This is because the base model and the LoRA are fused together when inference is executed. The mixed neural network is no longer suited to being expressed as a percentage.
LoRA is not the same as full fine-tuning, but it is one of several methods for training models, and there are various LoRA algorithms, each with its own strengths and weaknesses. (I am not familiar with each algorithm.)
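To see why, here is a minimal illustration (my own sketch, not kohya_ss or diffusers internals; the layer width and rank are made up) of how a LoRA is applied at inference: the low-rank update is added into the base weight, and every output is produced by the fused matrix as a whole.

import torch

d, r = 768, 8                  # hypothetical layer width and LoRA rank
W_base = torch.randn(d, d)     # frozen base-model weight
A = torch.randn(r, d)          # trained LoRA down-projection
B = torch.randn(d, r)          # trained LoRA up-projection
scale = 1.0                    # LoRA weight; can be lowered at inference

# Inference fuses the two into a single effective weight:
W_effective = W_base + scale * (B @ A)

x = torch.randn(d)
y = W_effective @ x            # the output comes from the fused matrix as a
                               # whole, so "X% base / Y% LoRA" is not well defined

Lowering scale blends the update in more weakly, which is exactly the knob discussed further down in this thread.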
Can I get the original last.safetensors weights file (for the model stabilityai/stable-diffusion-xl-base-1.0) without my customized training, so I can check the difference from my customized training? Here is my script with the LoRA applied:
from diffusers import AutoPipelineForText2Image, AutoencoderKL
import torch
import os
import numpy as np
from PIL import Image
print("vae")
# Clear GPU memory before starting
torch.cuda.empty_cache()
# Set seed for reproducibility
#torch.manual_seed(6666666)
#np.random.seed(6666666)
# Define the path to the directory containing your model and LoRA weights
print("Define the path to the directory containing your model and LoRA weights")
model_dir = "D:\\Ganu\\AIImage\\huggingface\\kohya_ss\\kohya_ss\\trained-model\\model\\"
lora_weights_path = os.path.join(model_dir, "last.safetensors")
# Load the base model using StableDiffusionPipeline
print("Load the base model using StableDiffusionPipeline")
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
adapter_id = "wangfuyun/PCM_SDXL_LoRAs"
#vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
# enable_sequential_cpu_offload manages device placement itself, so the pipeline should not be moved with .to() first
pipeline = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float32, variant="fp16")
pipeline.enable_sequential_cpu_offload()
pipeline.enable_attention_slicing("max")
# Load the LoRA weights
print("Load the LoRA weights")
try:
    pipeline.load_lora_weights(lora_weights_path, weight_name="last.safetensors")
except ValueError as e:
    print("Invalid LoRA checkpoint. Please check the compatibility and format of the weights file.")
    raise e
# Generate an image from a text prompt
print("Generate an image from a text prompt")
text_prompt = "DNA has to be shown in the background, and a Indain Woman with Skin Disease in the Foreground"
generated_image = pipeline(prompt=text_prompt).images[0]
generated_image.save("generated_image.png")
generated_image.show()
And here is the same script without loading the LoRA weights:
from diffusers import AutoPipelineForText2Image, AutoencoderKL
import torch
import os
import numpy as np
from PIL import Image
print("vae")
# Clear GPU memory before starting
torch.cuda.empty_cache()
# Set seed for reproducibility
#torch.manual_seed(6666666)
#np.random.seed(6666666)
# Load the base model using StableDiffusionPipeline
print("Load the base model using StableDiffusionPipeline")
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
adapter_id = "wangfuyun/PCM_SDXL_LoRAs"
#vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
# enable_sequential_cpu_offload manages device placement itself, so the pipeline should not be moved with .to() first
pipeline = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float32, variant="fp16")
pipeline.enable_sequential_cpu_offload()
pipeline.enable_attention_slicing("max")
# Generate an image from a text prompt
print("Generate an image from a text prompt")
text_prompt = "DNA has to be shown in the background, and a Indain Woman with Skin Disease in the Foreground"
generated_image = pipeline(prompt=text_prompt).images[0]
generated_image.save("generated_image.png")
generated_image.show()
I think this is because the latter code does not apply last.safetensors (the LoRA). Also, if you want to keep both the pre-training and post-training models in Kohya SS, you need to specify an option…
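If the goal is just to compare the base model with and without your customized training, you do not need a separate weights file: a recent diffusers (with the PEFT backend) can toggle a loaded LoRA on and off. A sketch reusing the paths and prompt from your scripts, with a fixed seed so the only difference between the two images is the LoRA:

import torch
from diffusers import AutoPipelineForText2Image

model_dir = "D:\\Ganu\\AIImage\\huggingface\\kohya_ss\\kohya_ss\\trained-model\\model\\"
pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float32, variant="fp16"
)
pipeline.enable_sequential_cpu_offload()
pipeline.load_lora_weights(model_dir, weight_name="last.safetensors")

prompt = "DNA has to be shown in the background, and an Indian Woman with Skin Disease in the Foreground"

pipeline.disable_lora()   # base model only
pipeline(prompt=prompt, generator=torch.Generator().manual_seed(42)).images[0].save("base_only.png")

pipeline.enable_lora()    # base model + your LoRA
pipeline(prompt=prompt, generator=torch.Generator().manual_seed(42)).images[0].save("with_lora.png")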
I am getting great images from the program without the LoRA. So if I want to retain the core design (without the LoRA) and then apply my LoRA fine-tuning on top of it for cosmetic changes (all in one go!), how can I achieve that?
I see. You want to train and apply the LoRA to the extent that it doesn't erase the goodness of the base model.
One way to do this is to lower the weight (scale) below 1.0 when applying a LoRA that has already been trained.
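For example (a sketch, assuming a recent diffusers with the PEFT backend, reusing pipeline, model_dir, and text_prompt from the scripts above; the adapter name "my_lora" is made up for illustration):

# Load the trained LoRA under an explicit adapter name, then dial its strength down
pipeline.load_lora_weights(model_dir, weight_name="last.safetensors", adapter_name="my_lora")
pipeline.set_adapters(["my_lora"], adapter_weights=[0.5])  # apply the LoRA at 50% strength
image = pipeline(prompt=text_prompt).images[0]

# Alternatively, pass the scale per call instead of per adapter:
# image = pipeline(prompt=text_prompt, cross_attention_kwargs={"scale": 0.5}).images[0]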
Another way is to control, through the training parameters, how strongly the training data influences the LoRA. In the case of Kohya SS, the relevant parameters are as follows.
There are a lot of "Training Parameters". Is there a default value for all of them, or will I have to do a lot of trial and error with each of them?
Existing semi-automatic training scripts such as Kohya SS and OneTrainer ship with parameter defaults that are already within an acceptable range.
So it would probably be faster to search for know-how on creating LoRAs for similar use cases and borrow the detailed parameters from there.
I think tools like Optuna are more like frameworks for finding parameters when fine-tuning models fully manually.
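For reference, a minimal Optuna sketch (my illustration; train_and_score is a hypothetical stand-in for launching a training run with the sampled parameters and scoring the resulting LoRA):

import optuna

def train_and_score(learning_rate, network_dim):
    # Hypothetical stand-in: in practice this would launch a Kohya SS run
    # with these parameters and return a quality metric for the result.
    return -abs(learning_rate - 1e-4)

def objective(trial):
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    network_dim = trial.suggest_categorical("network_dim", [4, 8, 16, 32])
    return train_and_score(learning_rate, network_dim)

study = optuna.create_study(direction="maximize")  # maximize the quality metric
study.optimize(objective, n_trials=20)
print(study.best_params)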