kohya_SS (Output Interpretation)

Hello

I have trained a LoRA with kohya_ss on the base model stabilityai/stable-diffusion-xl-base-1.0 using 10 images. I was wondering where the output comes from: the base model or my customized training.

What percentage of the final output does each part contribute?
E.g.:
(Base Model: 60%, Customized Training: 40%)
(Base Model: 70%, Customized Training: 30%)

For example:
The prompt is: DNA has to be shown in the background with an Indian Woman with Mouth Cancer in the Foreground

And the image created by the program is: [image]

The program is:

from diffusers import AutoPipelineForText2Image, AutoencoderKL
import torch
import os
import numpy as np

# Clear GPU memory before starting 
torch.cuda.empty_cache() 

# Set seed for reproducibility 
#torch.manual_seed(6666666) 
#np.random.seed(6666666)

# Define the path to the directory containing your model and LoRA weights
print("Define the path to the directory containing your model and LoRA weights")
model_dir = "D:\\Ganu\\AIImage\\huggingface\\kohya_ss\\kohya_ss\\trained-model\\model\\" 
lora_weights_path = os.path.join(model_dir, "last.safetensors")

# Load the base model with AutoPipelineForText2Image
print("Load the base model with AutoPipelineForText2Image")
model_id = "stabilityai/stable-diffusion-xl-base-1.0"

#vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipeline = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float32, variant="fp16").to("cpu")
pipeline.enable_sequential_cpu_offload()
pipeline.enable_attention_slicing("max")

# Load the LoRA weights
print("Load the LoRA weights")
try:
    pipeline.load_lora_weights(model_dir, weight_name="last.safetensors")
except ValueError as e:
    print("Invalid LoRA checkpoint. Please check the compatibility and format of the weights file.")
    raise e

# Generate an image from a text prompt
print("Generate an image from a text prompt")
text_prompt = "DNA has to be shown in the background with an Indian Woman with Mouth Cancer in the Foreground"
generated_image = pipeline(prompt=text_prompt).images[0]
generated_image.save("generated_image.png")
generated_image.show()

Good evening. That question is essentially impossible to answer…:sweat_smile:

The answer would be something like “it depends on the base model”, “it depends on what you want to express with LoRA (if it’s something like the characteristics of a person or a character, then LoRA will have a big impact)”, or “it can’t be expressed as a percentage in the first place”.

This is because the base model and the LoRA are fused together when inference is executed: the LoRA’s low-rank weight updates are added onto the base model’s weights, and the resulting mixed network can no longer be decomposed into percentages.
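As a rough illustration (a minimal sketch of the standard LoRA formulation, not kohya_ss internals; all shapes and values are made up):

import torch

# Minimal sketch: how a LoRA delta merges into a single base weight matrix.
# Shapes, names, and values are illustrative only.
d_out, d_in, rank, scale = 768, 768, 8, 1.0

W = torch.randn(d_out, d_in)         # base model weight
A = torch.randn(rank, d_in) * 0.01   # LoRA down-projection
B = torch.randn(d_out, rank) * 0.01  # LoRA up-projection

W_merged = W + scale * (B @ A)       # what inference effectively uses

# Every entry of W_merged mixes base and LoRA contributions, so there is no
# meaningful "X% base, Y% LoRA" decomposition of the generated image.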

LoRA is not the same as full fine-tuning, but it is one of the methods for training models, and there are various LoRA algorithms, each with its own strengths and weaknesses. (I am not familiar with each algorithm.)


Hello

Can I get the original last.safetensors weights file (for the model stabilityai/stable-diffusion-xl-base-1.0), without my customized training, so I can check the difference my training makes?


Hmmm? How do you want it to be?:thinking:

Sorry, I didn’t get your question.


Yeah, I didn’t understand it very well. I think you want to do something for comparison…

When I do training with kohya_ss (LoRA), it generates a last.safetensors file, which I use for image generation.

What I want is the original last.safetensors file, without the changes made by my training.


For example, the following code:

from diffusers import AutoPipelineForText2Image, AutoencoderKL
import torch
import os
import numpy as np

# Clear GPU memory before starting 
torch.cuda.empty_cache() 

# Set seed for reproducibility 
#torch.manual_seed(6666666) 
#np.random.seed(6666666)

# Define the path to the directory containing your model and LoRA weights
print("Define the path to the directory containing your model and LoRA weights")
model_dir = "D:\\Ganu\\AIImage\\huggingface\\kohya_ss\\kohya_ss\\trained-model\\model\\" 
lora_weights_path = os.path.join(model_dir, "last.safetensors")

# Load the base model with AutoPipelineForText2Image
print("Load the base model with AutoPipelineForText2Image")
model_id = "stabilityai/stable-diffusion-xl-base-1.0"

#vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipeline = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float32, variant="fp16").to("cpu")
pipeline.enable_sequential_cpu_offload()
pipeline.enable_attention_slicing("max")

# Load the LoRA weights
print("Load the LoRA weights")
try:
    pipeline.load_lora_weights(model_dir, weight_name="last.safetensors")
except ValueError as e:
    print("Invalid LoRA checkpoint. Please check the compatibility and format of the weights file.")
    raise e

# Generate an image from a text prompt
print("Generate an image from a text prompt")
text_prompt = "DNA has to be shown in the background, and an Indian Woman with a Skin Disease in the Foreground"
generated_image = pipeline(prompt=text_prompt).images[0]
generated_image.save("generated_image.png")
generated_image.show()

generates the image: [image]

Whereas the following code:

from diffusers import AutoPipelineForText2Image, AutoencoderKL
import torch
import numpy as np

# Clear GPU memory before starting 
torch.cuda.empty_cache() 

# Set seed for reproducibility 
#torch.manual_seed(6666666) 
#np.random.seed(6666666)

# Load the base model with AutoPipelineForText2Image
print("Load the base model with AutoPipelineForText2Image")
model_id = "stabilityai/stable-diffusion-xl-base-1.0"

#vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipeline = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float32, variant="fp16").to("cpu")
pipeline.enable_sequential_cpu_offload()
pipeline.enable_attention_slicing("max")


# Generate an image from a text prompt
print("Generate an image from a text prompt")
text_prompt = "DNA has to be shown in the background, and an Indian Woman with a Skin Disease in the Foreground"
generated_image = pipeline(prompt=text_prompt).images[0]
generated_image.save("generated_image.png")
generated_image.show()

generates the following image: [image]

The two images generated are very different.

I was wondering why…


> The two images generated are very different.

I think this is because the latter code does not load last.safetensors (the LoRA). Also, if you want to keep both the pre-training and post-training models in KohyaSS, you need to specify an option…
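To make that comparison fair, one option (a sketch assuming the pipeline and model_dir from the code above; the seed value is arbitrary) is to fix the seed and generate once with the LoRA loaded and once after unloading it:

import torch

# Sketch: A/B comparison of base model vs. base + LoRA with a fixed seed.
# Assumes `pipeline` and `model_dir` are set up as in the code above.
prompt = "DNA has to be shown in the background, and an Indian Woman with a Skin Disease in the Foreground"

pipeline.load_lora_weights(model_dir, weight_name="last.safetensors")
generator = torch.Generator(device="cpu").manual_seed(6666666)
with_lora = pipeline(prompt=prompt, generator=generator).images[0]
with_lora.save("with_lora.png")

pipeline.unload_lora_weights()  # back to the plain base model
generator = torch.Generator(device="cpu").manual_seed(6666666)
without_lora = pipeline(prompt=prompt, generator=generator).images[0]
without_lora.save("without_lora.png")

With the seed held constant, any remaining difference between the two images comes from the LoRA alone.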


Hello,

I am getting great images from the program without the LoRA. So if I want to retain the core design (without the LoRA) and then apply my LoRA fine-tuning on top of it for cosmetic changes (all in one go!), how can I achieve that?

Please advise. Thank you.


Good evening.:grinning:

I see. You want to train and apply the LoRA only to the extent that it doesn’t erase the strengths of the base model.
One way to do this is to lower the weight (scale) below 1.0 when applying a LoRA that has already been trained (see the sketch below).
Another way is to control, through the training parameters, how strongly the LoRA learns from the training data. In the case of KohyaSS, the parameters are as follows.

When applying LoRA: [screenshot]

When training LoRA: [screenshot]
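For the first approach, a LoRA loaded through diffusers’ load_lora_weights can be applied at reduced strength via the scale entry of cross_attention_kwargs. A minimal sketch, assuming the pipeline and model_dir from the earlier code and a recent diffusers version; the 0.5 value is an arbitrary example:

# Sketch: apply an already-trained LoRA at reduced strength.
# Assumes `pipeline` and `model_dir` from the earlier code; 0.5 is arbitrary.
pipeline.load_lora_weights(model_dir, weight_name="last.safetensors")

image = pipeline(
    prompt="DNA has to be shown in the background with an Indian Woman with Mouth Cancer in the Foreground",
    cross_attention_kwargs={"scale": 0.5},  # 1.0 = full LoRA effect, 0.0 = base model only
).images[0]
image.save("half_strength_lora.png")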

Hi John6666,

There are a lot of “Training Parameters”. Is there a default value for all of them, or will I have to do a lot of trial and error with each of them?


> Is there a default value for all of them,

Here.

> or will I have to do a lot of trial and error with each of them

Or search for the parameters used in a similar use case?:sweat_smile:


Automated hyperparameter optimization (Optuna)?


Existing semi-automatic training scripts such as Kohya SS and OneTrainer ship with parameters that are already within an acceptable range.
So it would probably be faster to search for know-how on creating LoRAs for similar use cases and borrow the detailed parameters.:sweat_smile:

I think Optuna and similar tools are more like frameworks for finding parameters when fine-tuning models fully manually.
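For reference, an Optuna-style search would look roughly like this (a sketch; train_lora_and_score is a hypothetical function you would have to build around the kohya_ss scripts, which is exactly the wiring that makes this approach heavyweight here):

import optuna

def objective(trial):
    # Sample candidate LoRA hyperparameters for one training run.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    dim = trial.suggest_categorical("network_dim", [4, 8, 16, 32])
    alpha = trial.suggest_categorical("network_alpha", [1, 4, 8, 16])
    # Hypothetical: train a LoRA with these values and return a validation
    # score to minimize; wiring this up around kohya_ss is the hard part.
    return train_lora_and_score(lr=lr, network_dim=dim, network_alpha=alpha)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
print(study.best_params)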


Would this be a good start?

How to Train a Highly Convincing Real-Life LoRA Model - MyAIForce.

