Program not working on GPU but works on CPU

deicool · May 15, 2025, 9:06am

Hello

I am getting the same black image even without the LoRA.

Python & Torch Details:

(venv) D:\Ganu\AIImage\project\Train-10Images-chatgptParameters\runs\1sstrun-23thApril2025\generation\1stGo>python
Python 3.10.10 (tags/v3.10.10:aad5f6a, Feb  7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>>
>>> # Check if torch was built with CUDA
>>> print("Is CUDA available? :", torch.cuda.is_available())
Is CUDA available? : True
>>> print("CUDA version (torch compiled with):", torch.version.cuda)
CUDA version (torch compiled with): 11.8
>>> print("Torch built with CUDA support:", torch.backends.cuda.is_built())
Torch built with CUDA support: True

Code:

import logging
from diffusers import AutoPipelineForText2Image, AutoencoderKL
import torch
import numpy as np
import random
import os
from PIL import Image

# =========================
# STEP 0: Logging Setup
# =========================
log_file = "generation_log.txt"
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_file),
        logging.StreamHandler()
    ]
)

logging.info("Initializing...")

# =========================
# STEP 1: Environment Setup
# =========================

torch.cuda.empty_cache()
torch.cuda.ipc_collect()

seed = random.randint(0, 9999999)
torch.manual_seed(seed)
np.random.seed(seed)
logging.info(f"Using seed: {seed}")

# ===============================
# STEP 2: Model and LoRA Setup
# ===============================
logging.info("Loading base model and LoRA weights...")

model_dir = "D:\\Ganu\\AIImage\\huggingface\\kohya_ss\\kohya_ss\\outputs"
lora_weights_path = os.path.join(model_dir, "model")
model_id = "stabilityai/stable-diffusion-xl-base-1.0"

# Custom VAE (fp16-fix variant), loaded in float16 and assigned to the pipeline below
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix",
    torch_dtype=torch.float16
).to("cuda")

try:
    pipeline = AutoPipelineForText2Image.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")
    logging.info("Pipeline loaded to GPU with float16.")
except Exception as e:
    logging.error(f"Failed to load model pipeline: {e}")
    raise

#pipeline.enable_model_cpu_offload()

# Replace the pipeline's default VAE with the custom VAE loaded above
pipeline.vae = vae

pipeline.enable_attention_slicing()
pipeline.enable_vae_slicing()

# LoRA loading is disabled for this test; the black image appears without it.
# try:
#     pipeline.load_lora_weights(lora_weights_path, weight_name="last.safetensors")
#     logging.info("LoRA weights loaded successfully.")
# except ValueError:
#     logging.error("Invalid LoRA checkpoint. Check the format or compatibility.")
#     raise

# =========================
# STEP 3: Prompt Inference
# =========================
text_prompt = (
    "A wide, breathtaking landscape with all real vibrant nature-themed background, lush forests, mountains, and a Doctor standing prominently in the foreground"
)

negative_prompt = (
    "text, letters, words, signage, logos, labels, writing, messy background, busy layout, clutter, double faces, abstract shapes, UI panels with words, overlapping elements, header, footer, top bar, navigation bar, bottom menu, toolbar, top text, website layout, browser frame, button row, page border, UI bar"
)

logging.info(f"Running inference with prompt: {text_prompt}")

try:
    result = pipeline(
        prompt=text_prompt,
        negative_prompt=negative_prompt,
        guidance_scale=7.5,
        num_inference_steps=30
    )
    generated_image = result.images[0]
    output_path = f"generated_image_{seed}.png"
    generated_image.save(output_path)
    logging.info(f"Image saved to: {output_path}")
    generated_image.show()
except Exception as e:
    logging.error(f"Error during image generation: {e}")
    raise
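
To confirm that the saved file is really all black (and not just very dark), the pixel values can be inspected directly. Below is a minimal sketch in plain Python. `is_all_black` is a hypothetical helper written for this post, not part of diffusers or Pillow, and the `threshold` parameter is an assumption added for illustration.

```python
def is_all_black(pixels, threshold=0):
    """Return True if every channel of every pixel is <= threshold.

    `pixels` is any iterable of ints (grayscale) or tuples of ints (RGB),
    for example the sequence returned by Pillow's Image.getdata().
    An empty iterable is reported as all black.
    """
    for px in pixels:
        # Grayscale pixels arrive as plain ints; wrap them so both cases
        # can be handled as a tuple of channel values.
        channels = px if isinstance(px, tuple) else (px,)
        if any(c > threshold for c in channels):
            return False
    return True
```

With the script above it could be called as `is_all_black(generated_image.convert("RGB").getdata())`. Converting to RGB first keeps an alpha channel, if there is one, from being counted as a non-black value.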

Environment:

1. Libraries

pip list
Package            Version
------------------ ------------------
accelerate         0.21.0
aiofiles           24.1.0
annotated-types    0.7.0
anyio              4.9.0
certifi            2025.1.31
charset-normalizer 3.4.1
click              8.1.8
colorama           0.4.6
deepspeed          0.10.0+f5c834a6
diffusers          0.21.4
exceptiongroup     1.2.2
fastapi            0.115.12
ffmpy              0.5.0
filelock           3.18.0
flash-attention    1.0.0
fsspec             2025.3.2
gradio             5.27.1
gradio_client      1.9.1
groovy             0.1.2
h11                0.16.0
hjson              3.1.0
httpcore           1.0.9
httpx              0.28.1
huggingface-hub    0.16.4
idna               3.10
importlib_metadata 8.6.1
Jinja2             3.1.6
markdown-it-py     3.0.0
MarkupSafe         3.0.2
mdurl              0.1.2
mpmath             1.3.0
mypy_extensions    1.1.0
networkx           3.4.2
ninja              1.11.1.4
numpy              1.23.1
orjson             3.10.16
packaging          25.0
pandas             2.2.3
peft               0.15.2
pillow             11.2.1
pip                25.1.1
psutil             7.0.0
py-cpuinfo         9.0.0
pydantic           1.10.13
pydantic_core      2.33.1
pydub              0.25.1
Pygments           2.19.1
pyre-extensions    0.0.29
python-dateutil    2.9.0.post0
python-multipart   0.0.20
pytz               2025.2
PyYAML             6.0.2
regex              2024.11.6
requests           2.32.3
rich               14.0.0
ruff               0.11.7
safehttpx          0.1.6
safetensors        0.5.3
semantic-version   2.10.0
setuptools         65.5.0
shellingham        1.5.4
six                1.17.0
sniffio            1.3.1
starlette          0.46.2
sympy              1.14.0
tokenizers         0.13.3
tomlkit            0.13.2
torch              2.4.0+cu118
torchaudio         2.7.0+cu118
torchvision        0.22.0+cu118
tqdm               4.67.1
transformers       4.31.0
typer              0.15.3
typing_extensions  4.13.2
typing-inspect     0.9.0
typing-inspection  0.4.0
tzdata             2025.2
urllib3            2.4.0
uvicorn            0.34.2
websockets         15.0.1
xformers           0.0.27.post2+cu118
zipp               3.21.0

2. nvidia-smi output

(venv) D:\Ganu\AIImage\project\Train-10Images-chatgptParameters\runs\1sstrun-23thApril2025\generation\1stGo>nvidia-smi
Thu May 15 14:36:50 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 576.40                 Driver Version: 576.40         CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1650      WDDM  |   00000000:01:00.0  On |                  N/A |
| N/A   47C    P8              5W /   50W |     698MiB /   4096MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            5648    C+G   ...IA app\CEF\NVIDIA Overlay.exe      N/A      |
|    0   N/A  N/A            6664    C+G   ...we\Microsoft.Media.Player.exe      N/A      |
|    0   N/A  N/A            7348    C+G   ...Chrome\Application\chrome.exe      N/A      |
|    0   N/A  N/A            7560    C+G   ...Chrome\Application\chrome.exe      N/A      |
|    0   N/A  N/A            7940    C+G   ....0.3240.64\msedgewebview2.exe      N/A      |
|    0   N/A  N/A            9104    C+G   C:\Windows\explorer.exe               N/A      |
|    0   N/A  N/A            9520    C+G   ...h_cw5n1h2txyewy\SearchApp.exe      N/A      |
|    0   N/A  N/A            9728    C+G   ...ntrolPanel\SystemSettings.exe      N/A      |
|    0   N/A  N/A           11968    C+G   ...h_cw5n1h2txyewy\SearchApp.exe      N/A      |
|    0   N/A  N/A           14064    C+G   ...5n1h2txyewy\TextInputHost.exe      N/A      |
|    0   N/A  N/A           15292    C+G   ...IA app\CEF\NVIDIA Overlay.exe      N/A      |
+-----------------------------------------------------------------------------------------+
