It was generated properly (though there is no LoRA)… Is it an issue with the environment, or is the library version not matching?
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler
import torch
import os
import numpy as np
from PIL import Image
# Define the path to the directory containing your model and LoRA weights
print("Define the path to the directory containing your model and LoRA weights")
model_dir = "D:\\Ganu\\AIImage\\huggingface\\kohya_ss\\kohya_ss\\trained-model\\model\\"
lora_weights_path = os.path.join(model_dir, "last.safetensors")  # defined but not applied in this "no LoRA" run
# Load the base model using StableDiffusionPipeline
print("Load the base model using StableDiffusionPipeline")
pipeline = StableDiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-2-1-base",
torch_dtype=torch.float16
).to("cuda")
# Generate an image from a text prompt
print("Generate an image from a text prompt")
text_prompt = "A beautiful Woman"
pil_image = pipeline(prompt=text_prompt).images[0]
# Save or display the generated image
print("Save or display the generated image")
# Convert the NumPy array to a PIL Image and save or display the generated image
pil_image.save("generated_image.jpg")
pil_image.show()
Hmm, that doesn't look strange.
If I had to say, PyTorch would be my first suspect, but if that were broken, I'd expect no black image to be generated at all and the script to simply crash…
Is the safety_checker's blackout feature being triggered? Was that present in 2.1 as well?
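If the safety checker were the cause, the quickest way to rule it out is to load the pipeline without it. For what it's worth, the SD 2.x checkpoints on the Hub generally ship without a safety checker, but disabling it explicitly is still the easiest check. A minimal sketch, assuming the same base model as above (only advisable for local debugging):

from diffusers import StableDiffusionPipeline
import torch

# Load the same base model but without the safety checker, so a false-positive
# NSFW flag cannot silently replace the output with a black image.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float16,
    safety_checker=None,            # skip the checker entirely
    requires_safety_checker=False,  # silence the warning about removing it
).to("cuda")

image = pipe(prompt="A beautiful Woman").images[0]
image.save("no_safety_checker.jpg")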
generated_image = pipeline(prompt=text_prompt).images[0]
# Handle NaN or infinite values and ensure the range is valid
print("Handle NaN or infinite values and ensure the range is valid")
generated_image = np.array(generated_image)  # the pipeline returns a PIL image by default
generated_image = np.nan_to_num(generated_image, nan=0.0, posinf=255.0, neginf=0.0)
generated_image = np.clip(generated_image, 0, 255)
generated_image = generated_image.astype(np.uint8)
# Save or display the generated image
print("Save or display the generated image")
# Convert the NumPy array back to a PIL Image and save or display the generated image
pil_image = Image.fromarray(generated_image)
For now, the pipeline returns a PIL image, so that part is basically OK. But the problem isn't here, is it…
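If the NaN check is meant to actually catch anything, it has to run before the pipeline's image processor casts to uint8 (that cast is where the RuntimeWarning in the log further down comes from). A minimal sketch, assuming the pipeline from above is already loaded; output_type="np" makes the call return float arrays in [0, 1] instead of an already-converted PIL image:

# Ask the pipeline for raw float output so NaN/inf can still be detected.
result = pipeline(prompt=text_prompt, output_type="np")
arr = result.images[0]  # float32 array of shape (H, W, 3), values nominally in [0, 1]

if np.isnan(arr).any() or np.isinf(arr).any():
    print("NaN/inf in the decoded image - fp16 numerics are the likely culprit")

arr = np.nan_to_num(arr, nan=0.0, posinf=1.0, neginf=0.0)
arr = (np.clip(arr, 0.0, 1.0) * 255).round().astype(np.uint8)
pil_image = Image.fromarray(arr)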
D:\Ganu\AIImage\huggingface\kohya_ss\kohya_ss\user>python John-13thJan2024-NoLora.py
Define the path to the directory containing your model and LoRA weights
Load the base model using StableDiffusionPipeline
Traceback (most recent call last):
  File "D:\Ganu\AIImage\huggingface\kohya_ss\kohya_ss\user\John-13thJan2024-NoLora.py", line 14, in <module>
    pipeline = StableDiffusionPipeline.from_pretrained(
  File "D:\Ganu\AIImage\huggingface\kohya_ss\Python310\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "D:\Ganu\AIImage\huggingface\kohya_ss\Python310\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 710, in from_pretrained
    raise NotImplementedError(
NotImplementedError: auto not supported. Supported strategies are: balanced
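That NotImplementedError is what diffusers raises when from_pretrained receives device_map="auto"; per the message, the only sharding strategy supported for pipelines is "balanced". The script shown above does not pass device_map, so the run that produced this traceback was presumably using a version of the script that did. A hedged sketch of the two ways around it:

# Option 1: omit device_map entirely and place the whole pipeline on one GPU.
pipeline = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float16,
).to("cuda")

# Option 2: if sharding across devices is really needed, use the one strategy
# the error message says is supported (requires a recent diffusers + accelerate):
# pipeline = StableDiffusionPipeline.from_pretrained(
#     "stabilityai/stable-diffusion-2-1-base",
#     torch_dtype=torch.float16,
#     device_map="balanced",
# )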
python John-13thJan2024-NoLora.py
Define the path to the directory containing your model and LoRA weights
Load the base model using StableDiffusionPipeline
Loading pipeline components...:  67%|████████████████████████████                  | 4/6 [00:26<00:15,  7.91s/it]Taking `'Attention' object has no attribute 'key'` while using `accelerate.load_checkpoint_and_dispatch` to mean C:\Users\ADMIN\.cache\huggingface\hub\models--stabilityai--stable-diffusion-2-1-base\snapshots\5ede9e4bf3e3fd1cb0ef2f7a3fff13ee514fdf06\vae was saved with deprecated attention block weight names. We will load it with the deprecated attention block names and convert them on the fly to the new attention block format. Please re-save the model after this conversion, so we don't have to do the on the fly renaming in the future. If the model is from a hub checkpoint, please also re-upload it or open a PR on the original repository.
Loading pipeline components...: 100%|██████████████████████████████████████████████| 6/6 [00:29<00:00,  4.84s/it]
Generate an image from a text prompt
100%|██████████████████████████████████████████████| 50/50 [02:38<00:00,  3.18s/it]
D:\Ganu\AIImage\huggingface\kohya_ss\Python310\lib\site-packages\diffusers\image_processor.py:147: RuntimeWarning: invalid value encountered in cast
  images = (images * 255).round().astype("uint8")
<PIL.Image.Image image mode=RGB size=512x512 at 0x1B4099BFF70>
Save or display the generated image
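Since the opening question was whether the environment or the library versions are at fault, it may help to print them at the top of the script before anything else. A small diagnostic sketch (nothing here is specific to this model):

import torch, diffusers, transformers, accelerate

print("torch       :", torch.__version__, "| CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU         :", torch.cuda.get_device_name(0))
print("diffusers   :", diffusers.__version__)
print("transformers:", transformers.__version__)
print("accelerate  :", accelerate.__version__)

If CUDA is reported as unavailable, or the versions are far from what the diffusers release notes expect, that would point back to the environment rather than the model.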
To begin with, there is the philosophical question of "what is a copy?", and then there is the issue of infringement of rights when a rights holder for the original photograph or picture exists. For that reason, I think there are cases where something is treated as a copy in the legal or social sense, but technically, on a computer, what a diffusion model does is not a copy of data or a database lookup; it is closer to "memory recall" from a neural network.
Even if you drew a hyper-realistic picture by hand that looked exactly like an existing work of art, at the same quality as a photograph, it might be considered a reproduction, but it wouldn't be called a copy. Someone might get angry about it, though.
No matter which algorithm is used, the model simply doesn't have the space to store all of the images used for training.
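A rough back-of-the-envelope calculation makes the point. The figures below are only order-of-magnitude assumptions (roughly a billion parameters in fp16 versus a couple of billion training images), not exact numbers for any specific model:

# Approximate sizes: ~1e9 parameters at 2 bytes each (fp16) versus
# ~2e9 training images at a (generously small) 50 KB apiece.
params = 1.0e9
model_bytes = params * 2                 # ~2 GB of weights
images = 2.0e9                           # order-of-magnitude training-set size
dataset_bytes = images * 50_000          # ~100 TB of pixels

print(f"model  : {model_bytes / 1e9:.1f} GB")
print(f"dataset: {dataset_bytes / 1e12:.1f} TB")
print(f"bytes of weights per training image: {model_bytes / images:.1f}")

At roughly one byte of weights per training image, verbatim storage is simply not possible; what is learned is a compressed statistical representation.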
Is there an AI model which generates images from a text prompt?
I think all of the currently popular text-to-image models are basically capable of this. (SD 2.1 can do it as well.)
If you want prompts that are closer to natural language, you can get there with newer architectures such as FLUX. (In that case the model is too heavy to run locally here, so I think you would have to use some kind of cloud service; see the sketch below.)
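For example, one way to try FLUX from a cloud service is Hugging Face's Inference API via huggingface_hub. A sketch, assuming you have an HF access token configured and that the named checkpoint is available through an inference provider you can use:

from huggingface_hub import InferenceClient

# Assumes a Hugging Face token (e.g. in the HF_TOKEN environment variable)
# and that this FLUX checkpoint is served by a provider you have access to.
client = InferenceClient(model="black-forest-labs/FLUX.1-schnell")
image = client.text_to_image("A beautiful woman reading a book in a sunlit cafe")
image.save("flux_schnell.jpg")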
And the model hasn't been trained on any real data, something like DeepMind's AlphaGo?
They say that for Go, data from real professional players is no longer needed to strengthen the model. In the case of Go there is a mathematically correct answer, so that approach is possible; you just need to reduce the number of incorrect answers.
However, words, pictures and photographs are almost entirely dependent on human perception, and are a kind of illusion created by humans. Birds and insects see colors differently, and the concepts that words refer to are even more unstable. In other words, while these models may be suitable or unsuitable for a given purpose, there are not many right or wrong answers, and relying solely on mathematics is probably not a good approach. The only thing that can be guaranteed physically is shape.
In addition, there has recently been a widespread attempt to train AI models using data output by AI models that have reached a certain level of maturity (synthetic data). However, this does not mean that real-world data has been excluded.
Also, there are several models that have been trained using only data that does not raise any legal issues (Creative Commons-compliant models). There are some on HF as well.
However, I don't know of any model that doesn't use any real-world data at all. If you did that, I think you would get a model that generates some kind of image that doesn't exist in the real world (I don't even know whether humans would be able to recognize it as an image)…
Well, if you initialize the model weights randomly and then train it, you might be able to get something close, but the question is how to provide the training images without using real-world data… It might be a chicken-and-egg problem.