Creation of Images from Text-Prompt (Customized Training)

It was generated properly (though there is no LoRA)… Is it an issue with the environment, or is the library version not matching?

from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler
import torch
import os
import numpy as np
from PIL import Image

# Define the path to the directory containing your model and LoRA weights
print("Define the path to the directory containing your model and LoRA weights")
model_dir = "D:\\Ganu\\AIImage\\huggingface\\kohya_ss\\kohya_ss\\trained-model\\model\\"
lora_weights_path = os.path.join(model_dir, "last.safetensors")

# Load the base model using StableDiffusionPipeline
print("Load the base model using StableDiffusionPipeline")
pipeline = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float16
).to("cuda")

# Generate an image from a text prompt
print("Generate an image from a text prompt")
text_prompt = "A beautiful Woman"
pil_image = pipeline(prompt=text_prompt).images[0]

# Save or display the generated image
print("Save or display the generated image")

# The pipeline already returns a PIL image, so save and display it directly
pil_image.save("generated_image.jpg")
pil_image.show()

# https://pytorch.org/get-started/locally/ and
pip install -U accelerate peft diffusers
Requirement already satisfied: accelerate in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (1.2.1)
Requirement already satisfied: peft in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (0.14.0)
Requirement already satisfied: diffusers in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (0.32.1)
Requirement already satisfied: numpy<3.0.0,>=1.17 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from accelerate) (2.2.1)
Requirement already satisfied: packaging>=20.0 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from accelerate) (24.2)
Requirement already satisfied: psutil in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from accelerate) (6.1.1)
Requirement already satisfied: pyyaml in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from accelerate) (6.0.2)
Requirement already satisfied: torch>=1.10.0 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from accelerate) (2.5.1+cu118)
Requirement already satisfied: huggingface-hub>=0.21.0 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from accelerate) (0.27.1)
Requirement already satisfied: safetensors>=0.4.3 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from accelerate) (0.5.2)
Requirement already satisfied: transformers in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from peft) (4.48.0)
Requirement already satisfied: tqdm in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from peft) (4.67.1)
Requirement already satisfied: importlib-metadata in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from diffusers) (8.5.0)
Requirement already satisfied: filelock in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from diffusers) (3.16.1)
Requirement already satisfied: regex!=2019.12.17 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from diffusers) (2024.11.6)
Requirement already satisfied: requests in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from diffusers) (2.32.3)
Requirement already satisfied: Pillow in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from diffusers) (11.1.0)
Requirement already satisfied: fsspec>=2023.5.0 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from huggingface-hub>=0.21.0->accelerate) (2024.12.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from huggingface-hub>=0.21.0->accelerate) (4.12.2)
Requirement already satisfied: networkx in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from torch>=1.10.0->accelerate) (3.4.2)
Requirement already satisfied: jinja2 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from torch>=1.10.0->accelerate) (3.1.5)
Requirement already satisfied: sympy==1.13.1 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from torch>=1.10.0->accelerate) (1.13.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from sympy==1.13.1->torch>=1.10.0->accelerate) (1.3.0)
Requirement already satisfied: colorama in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from tqdm->peft) (0.4.6)
Requirement already satisfied: zipp>=3.20 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from importlib-metadata->diffusers) (3.21.0)
Requirement already satisfied: charset-normalizer<4,>=2 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from requests->diffusers) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from requests->diffusers) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from requests->diffusers) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from requests->diffusers) (2024.12.14)
Requirement already satisfied: tokenizers<0.22,>=0.21 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from transformers->peft) (0.21.0)
Requirement already satisfied: MarkupSafe>=2.0 in d:\ganu\aiimage\huggingface\kohya_ss\python310\lib\site-packages (from jinja2->torch>=1.10.0->accelerate) (2.1.5)
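
For reference, the same version information can also be collected from inside Python, which is easier to paste into a post than the full pip output. This is only a minimal sketch using the standard library; the helper name `installed_versions` and the package list are illustrative choices, not something taken from the thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages that matter for this script (illustrative list)
relevant = ["torch", "diffusers", "transformers", "accelerate", "peft", "safetensors", "numpy"]

for name, version in installed_versions(relevant).items():
    print(f"{name}: {version or 'not installed'}")
```

Run it with the same python.exe that runs the script, otherwise it may report a different environment.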

Hmm, nothing there looks strange.
If I had to pick something, I’d suspect PyTorch, but if PyTorch itself were broken, I’d expect a crash rather than a black image…
Is the safety_checker’s blackout function being triggered? Was there one in 2.1 too?

Edit:
It doesn’t seem to be in 2.1.

I added safety_checker=None to disable the safety checker, but I still get a black image:

pipeline = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float16,
    safety_checker=None
).to("cuda")

I am getting the following warning:

D:\Ganu\AIImage\huggingface\kohya_ss\Python310\lib\site-packages\diffusers\image_processor.py:147: RuntimeWarning: invalid value encountered in cast
  images = (images * 255).round().astype("uint8")
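
That warning is what NumPy prints when invalid floating-point values (typically NaN or infinity) are cast to an integer type, which suggests the image array was already invalid before it was converted to uint8. Below is a minimal sketch for checking this, assuming NumPy is installed. `count_invalid` is a hypothetical helper, and the array is a synthetic stand-in, not real pipeline output.

```python
import numpy as np


def count_invalid(images: np.ndarray) -> dict:
    """Count NaN and infinite entries in a floating-point image array."""
    return {
        "nan": int(np.isnan(images).sum()),
        "inf": int(np.isinf(images).sum()),
        "total": int(images.size),
    }


# Synthetic stand-in for a decoded batch in [0, 1]: one 2x2 RGB image
images = np.full((1, 2, 2, 3), 0.5, dtype=np.float32)
images[0, 0, 0, :] = np.nan  # pretend one pixel came out as NaN

report = count_invalid(images)
print(report)
```

With the real pipeline, passing output_type="np" should return the images as a float array instead of PIL images, so the same check could be run on `pipeline(prompt=text_prompt, output_type="np").images`. If most values are NaN there, the failure happened inside the model, and cleaning up the final image can only hide it.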
generated_image = pipeline(prompt=text_prompt).images[0]

# Handle NaN or infinite values and ensure the range is valid 
print("Handle NaN or infinite values and ensure the range is valid ")
generated_image = np.nan_to_num(generated_image, nan=0.0, posinf=255.0, neginf=0.0) 
generated_image = np.clip(generated_image, 0, 255) 
generated_image = generated_image.astype(np.uint8)

# Save or display the generated image
print("Save or display the generated image")

# Convert the NumPy array to a PIL Image and save or display the generated image 
pil_image = Image.fromarray(generated_image) 

The pipeline already returns a PIL image, so the single line below is basically all you need. But I don’t think the problem is here…

pil_image = pipeline(prompt=text_prompt).images[0]

Is it because I have an old Lenovo Legion laptop from 2022? (I have updated all the NVIDIA drivers, though.)

Possibly. If it’s because of a lack of VRAM, this might work. (The missing amount is made up for with normal RAM.)

pipeline = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float16,
    device_map="auto",
)#.to("cuda")

D:\Ganu\AIImage\huggingface\kohya_ss\kohya_ss\user>python John-13thJan2024-NoLora.py
Define the path to the directory containing your model and LoRA weights
Load the base model using StableDiffusionPipeline
Traceback (most recent call last):
  File "D:\Ganu\AIImage\huggingface\kohya_ss\kohya_ss\user\John-13thJan2024-NoLora.py", line 14, in <module>
    pipeline = StableDiffusionPipeline.from_pretrained(
  File "D:\Ganu\AIImage\huggingface\kohya_ss\Python310\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "D:\Ganu\AIImage\huggingface\kohya_ss\Python310\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 710, in from_pretrained
    raise NotImplementedError(
NotImplementedError: auto not supported. Supported strategies are: balanced

How about this…

pipeline = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float16,
    device_map="balanced",
)#.to("cuda")

I am still getting a blank image output

python John-13thJan2024-NoLora.py
Define the path to the directory containing your model and LoRA weights
Load the base model using StableDiffusionPipeline
Loading pipeline components...:  67%|██████▋   | 4/6 [00:26<00:15,  7.91s/it]
Taking `'Attention' object has no attribute 'key'` while using `accelerate.load_checkpoint_and_dispatch` to mean C:\Users\ADMIN\.cache\huggingface\hub\models--stabilityai--stable-diffusion-2-1-base\snapshots\5ede9e4bf3e3fd1cb0ef2f7a3fff13ee514fdf06\vae was saved with deprecated attention block weight names. We will load it with the deprecated attention block names and convert them on the fly to the new attention block format. Please re-save the model after this conversion, so we don't have to do the on the fly renaming in the future. If the model is from a hub checkpoint, please also re-upload it or open a PR on the original repository.
Loading pipeline components...: 100%|██████████| 6/6 [00:29<00:00,  4.84s/it]
Generate an image from a text prompt
100%|██████████| 50/50 [02:38<00:00,  3.18s/it]
D:\Ganu\AIImage\huggingface\kohya_ss\Python310\lib\site-packages\diffusers\image_processor.py:147: RuntimeWarning: invalid value encountered in cast
  images = (images * 255).round().astype("uint8")
<PIL.Image.Image image mode=RGB size=512x512 at 0x1B4099BFF70>
Save or display the generated image

It’s blank, but the returned image is 512x512, so in a sense the program finished normally…
I’ll do a quick search. This is weird.

I found this. Try float32 instead of float16:

pipeline = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    torch_dtype=torch.float32,
    device_map="balanced",
)#.to("cuda")
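
One plausible explanation for why float32 helps (an assumption, not something confirmed in this thread) is that some intermediate value exceeds the float16 range, becomes infinity, and then turns into NaN, which would match the "invalid value encountered in cast" warning. The NumPy sketch below only illustrates that numeric effect; it does not run the model, and the value 60000 is an arbitrary illustrative number.

```python
import numpy as np

# Largest finite value each format can represent
fp16_max = float(np.finfo(np.float16).max)
fp32_max = float(np.finfo(np.float32).max)

# Silence the overflow/invalid warnings that this demo triggers on purpose
with np.errstate(over="ignore", invalid="ignore"):
    big16 = np.float16(60000.0)
    overflow16 = big16 + big16        # result exceeds the float16 range -> inf
    nan16 = overflow16 - overflow16   # inf - inf -> nan

    big32 = np.float32(60000.0)
    sum32 = big32 + big32             # float32 represents this without trouble

print("float16 max:", fp16_max)
print("float16 sum:", overflow16, "then inf - inf:", nan16)
print("float32 sum:", sum32)
```

The trade-off is that float32 needs roughly twice the memory for the weights and is usually slower than float16.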

It works! Thank you so much, John6666. I will get back to you if I have more questions.

Hi,

Are the images generated by ‘Stable Diffusion’ original, or are they copies of the data used for training?

To begin with, there is the philosophical question of “what is a copy?”, and then there is the question of infringement when a rights holder exists for the original photograph or picture. For that reason, I think there are cases where an output is considered a copy in the legal or social sense. Technically, though, what a diffusion model does on a computer is neither copying data nor searching a database; it is closer to “memory recall” from a neural network.

If you drew, by hand, a hyper-realistic picture that looked exactly like an existing work and had the quality of a photograph, people might regard it as a copy, but technically it would not be one. Someone might still get angry about it, though.

Whichever algorithm is used, the model does not have the space to store all the images used for training.
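
A rough back-of-the-envelope calculation shows why. The numbers below are illustrative round figures chosen only to show the order of magnitude; they are assumptions, not the actual parameter count or dataset size of any particular model.

```python
def bytes_per_training_image(num_parameters: int, bytes_per_parameter: int, num_images: int) -> float:
    """Average number of weight bytes available per training image."""
    return num_parameters * bytes_per_parameter / num_images


# Illustrative round numbers (assumptions, not measured values)
assumed_parameters = 1_000_000_000       # about one billion weights
assumed_bytes_per_parameter = 2          # float16 storage
assumed_training_images = 1_000_000_000  # about one billion images

budget = bytes_per_training_image(
    assumed_parameters, assumed_bytes_per_parameter, assumed_training_images
)

# For comparison: one uncompressed 512x512 RGB image
raw_image_bytes = 512 * 512 * 3

print(f"{budget:.1f} bytes of weights per training image")
print(f"{raw_image_bytes} bytes for one raw 512x512 RGB image")
```

Under these assumptions the model has only a couple of bytes per training image, while a single raw 512x512 image takes hundreds of kilobytes.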

That’s insightful, thank you. One follow-up question:

Is there an AI model that generates images from a text prompt? (And one that hasn’t been trained on any real data, something like DeepMind AlphaGo)

Is there an AI model that generates images from a text prompt?

I think basically all of the currently popular text-to-image models can do this. (SD2.1 can too.)
If you want to write prompts that are closer to natural language, newer architectures such as FLUX handle that better. (FLUX is too heavy for this laptop, though, so I think you would have to use some kind of cloud service.)

And one that hasn’t been trained on any real data, something like DeepMind AlphaGo

They say that for Go, data from real professional players is no longer needed to strengthen the model. In Go there is a mathematically correct answer, so that approach is possible. You just need to reduce the number of incorrect answers.

However, words, pictures and photographs depend almost entirely on human perception, and are a kind of illusion created by humans. Birds and insects see colors differently, and the concepts that words refer to are even less stable. In other words, while an output may be suitable or unsuitable for a given purpose, there are not many right or wrong answers, so relying solely on mathematics is probably not a good approach. The only thing that can be guaranteed physically is shape.
In addition, it has recently become common to train AI models on data generated by AI models that have reached a certain level of maturity (synthetic data). However, this does not mean that real-world data has been excluded.
Also, there are several models that have been trained only on data that does not raise legal issues (Creative Commons-compliant models). Some of them are on HF.

However, I don’t know of any model that uses no real-world data at all. If you built one, I think it would generate some kind of image that doesn’t exist in the real world (I don’t even know whether humans would be able to recognize it as an image)…
Well, if you initialized the model weights randomly and then trained it, you might get something close, but the question is how to provide training images without using real-world data… It might be a chicken-and-egg problem.
