Late to the party, how do I handle a (NSFW) image generation model I uploaded to Vertex AI?

Yes, I know I am late to the party, and I guess that's the reason why Google isn't really helping me find answers.
I picked a model and hosted it on my Vertex AI space. The model is running, but I am having trouble finding the right docs to control it via Python.

That's the model in question, and I would really appreciate a few pointers to the right resources!

1 Like

Yes, it can be challenging to find the right documentation, especially when getting started with Vertex AI. Don't worry, you're in the right place to figure it out! Here's a quick rundown to help you get started:

  1. Vertex AI Python SDK:
    You'll likely need the Vertex AI Python SDK (google-cloud-aiplatform) to interact with your hosted model programmatically. Start by installing it:

    pip install google-cloud-aiplatform  
    
  2. Authentication:
    Make sure you have authenticated your Python environment with Google Cloud. If you haven't already:

    gcloud auth application-default login  
    
  3. Getting Started with Deployed Models:
    Once your model is deployed, you can send prediction requests using the SDK. Here's a basic example (see also the endpoint-listing sketch after these steps):

    from google.cloud import aiplatform  
    
    # Initialize the client  
    aiplatform.init(project='your-project-id', location='your-region')  
    
    # Specify endpoint details  
    endpoint = aiplatform.Endpoint(endpoint_name='your-endpoint-id')  
    
    # Send a prediction  
    response = endpoint.predict(instances=[{"your-input": "value"}])  
    print(response.predictions)  
    

Happy to help further! :blush:

1 Like

Basically, this should work. I think it's more difficult to install PyTorch than the code itself. I don't know if it can be used with Vertex AI…

# pip install -U diffusers peft accelerate transformers huggingface_hub
from diffusers import DiffusionPipeline

modelname = "John6666/****"
pipe = DiffusionPipeline.from_pretrained(modelname)  # add .to("cuda") if a GPU is available
prompt = "1girl"
image = pipe(prompt).images[0]

Hey guys, thanks a lot for your input, but I am afraid I might be completely lost, and what frustrates me most is that I am usually able to resolve stuff like this myself, but this time neither Google nor GPT help.
I read the Vertex AI docs and multiple config tutorials for different models that are also based on SDXL, but I keep getting the same few errors.

Current state:
I changed the model to LINK and successfully deployed it. The model is running and I have access to the logs. Trying to use the Vertex AI model testing feature, which lets you input a JSON snippet for a call, always gets me the same error about not allowing strings as the prompt.
My Python script at least reaches the model and receives error messages, and the calls show up in the model's log, but no matter what I do I can't seem to get it right.
I tried countless different formatting options, but even with the help of o1-preview I can't get past this error:

google.api_core.exceptions.InvalidArgument: 400 {"error":"`prompt` has to be of type `str` or `list` but is <class 'dict'>"}       

I really do know what a string is, and please trust me, I tried so many different ways… At this point I think I am missing a crucial step, and I would appreciate some more help!
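
Reading that error literally: whatever reaches the handler's `prompt` argument is arriving as a dict, which suggests the whole {"inputs": ...} object is being passed through as the prompt. A hedged sketch of what that implies for the request shape (this is borne out further down the thread):

    # Bare prompt strings inside the instances list, not {"inputs": ...} dicts
    instances = ["A cat"]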

This is just one of the countless versions I tried:

import base64

from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value


def predict_text_to_image(
    project_id: str,
    endpoint_id: str,
    prompt: str,
    location: str = "my-basement",
    api_endpoint: str = "complicated garbage",
):
    client_options = {"api_endpoint": api_endpoint}
    client = aiplatform.gapic.PredictionServiceClient(client_options=client_options)

    endpoint = client.endpoint_path(
        project=project_id, location=location, endpoint=endpoint_id
    )

    instances = [
        {"inputs": prompt}
    ]

    # Optionally, you can set parameters (e.g., specifying image size)
    parameters = None  # or set to a dictionary if your model expects parameters

    instances = [
        json_format.ParseDict(instance, Value()) for instance in instances
    ]
    if parameters is not None:
        parameters = json_format.ParseDict(parameters, Value())

    # Make the prediction request
    response = client.predict(
        endpoint=endpoint, instances=instances, parameters=parameters
    )

    # Handle the response
    print("Response:")
    print(f" Deployed Model ID: {response.deployed_model_id}")

    # Process each prediction
    for i, prediction in enumerate(response.predictions):
        # Convert the prediction (protobuf Value) to a dictionary
        prediction_dict = json_format.MessageToDict(prediction)

        # Assuming the image is returned as a base64-encoded string under 'image' key
        if 'image' in prediction_dict:
            img_b64 = prediction_dict['image']
            # Decode the base64 image
            img_bytes = base64.b64decode(img_b64)
            # Save the image to a file
            image_filename = f"output_{i}.png"
            with open(image_filename, "wb") as img_file:
                img_file.write(img_bytes)
            print(f" Image saved as {image_filename}")
        else:
            print(" No image data found in the prediction response.")


if __name__ == "__main__":
    project_id = "cat-1337"
    endpoint_id = "13376969"
    location = "dreamland"

    prompt = "A cat"
    predict_text_to_image(
        project_id=project_id,
        endpoint_id=endpoint_id,
        prompt=prompt,
        location=location,
    )

1 Like

Perhaps this?

    instances = [
        {"inputs": prompt}
    ]

    # Optionally, you can set parameters (e.g., specifying image size)
    parameters = None  # or set to a dictionary if your model expects parameters

    instances = [
        json_format.ParseDict(instance, Value()) for instance in instances
    ]

to

    instances = {"inputs": prompt}

    # Optionally, you can set parameters (e.g., specifying image size)
    parameters = None  # or set to a dictionary if your model expects parameters

    instances = json_format.ParseDict(instances, Value())

from

# You can go from a Python dict or JSON string to protobuf like:

import json

from google.protobuf.json_format import Parse, ParseDict

d = {
    "first": "a string",
    "second": True,
    "third": 123456789
}

message = ParseDict(d, Thing())
# or
message = Parse(json.dumps(d), Thing())    

print(message.first)  # "a string"
print(message.second) # True
print(message.third)  # 123456789
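
Also, since Value is a protobuf well-known type, ParseDict should accept plain JSON scalars too, not just dicts. So if the container really wants the prompt as a bare string, something like this might work (a sketch, untested on Vertex AI):

    from google.protobuf import json_format
    from google.protobuf.struct_pb2 import Value

    prompt = "A cat"
    # Wrap the bare prompt string itself, with no {"inputs": ...} dict around it
    instances = [json_format.ParseDict(prompt, Value())]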

Thanks mate!

We are finally making progress!! Messing around some more with "instances" while applying your suggestions, I found out that instances = prompt with prompt = "this is my prompt" finally produces something…

Instead of a picture of a test object I received some colorful static, but hey, at least I finally received an image file!

I guess now it is about finding the right parameters, but I ran into another issue that had me redeploy the model, which takes forever… I increased the inference steps and the model would hang while generating, without any progress.

1 Like

I have spent countless hours now trying to get this to work and tried several different SDXL models, but I think I am completely lost. I don't get why there are no complete Python script examples out there in the wild for use with Google Cloud/Vertex AI, at least I can't find any. :frowning:

1 Like

That's true. At the very least, Gemini seems like it should know about it…:sweat_smile:
There may be documentation on how to use the HF Endpoint API or the Serverless Inference API.
However, I can't find much when I search…
The JSON specification on the HF side can be found in the documentation and on GitHub.

Hah, trust me, you don't want to know how many Gemini calls I made, which are free at least, but I also used way too many paid tokens over at OpenAI…

1 Like

OK, I am still getting images that look like static. Changing height/width in my call changes the received image's resolution, and changing inference steps (0-100) only increases the resolution of the static.

I found this error in the Vertex AI logs, which is roughly the same for all three models I tried, but I have no idea how to change the model's config on a preconfigured model like that. Any ideas?


Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
Loading pipeline components...:  14%|█▍        | 1/7 [00:03<00:19,  3.29s/it]The config attributes {'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'beta_start': 0.00085, 'clip_sample': False, 'interpolation_type': 'linear', 'set_alpha_to_one': False, 'skip_prk_steps': True, 'steps_offset': 1, 'timestep_spacing': 'leading', 'trained_betas': None, 'use_karras_sigmas': False} were passed to EDMDPMSolverMultistepScheduler, but are not expected and will be ignored. Please verify your scheduler_config.json configuration file.

edit: this error comes up right after the model is deployed

1 Like

That error is a sampler warning that appears depending on the version of Diffusers, but it shouldn't cause any real harm.
And if it does appear, it's proof that the model is at least being loaded more or less correctly.

Even so, the fact that the same kind of output is always returned suggests that everything apart from the parameters is generally correct, and that the parameters just aren't being recognized. Actually, I wonder if the parameter names the Endpoint expects are different from the ones we're using now? I'll check it out a bit.

Edit:
I can't find any documentation on Text-to-Image…

Documents

Inference Endpoint

Additional parameters

This is my current code, since I switched over to POST because it is faster. I can still only get static, which changes depending on my height/width/inference settings, and I now discovered I am getting:
Status Code: 200

import requests
import json
import base64
from google.auth import default
from google.auth.transport.requests import Request
from PIL import Image
from io import BytesIO

# Obtain credentials
credentials, project = default()
credentials.refresh(Request())
access_token = credentials.token

# Define the endpoint URL (UPDATE FOR PRE-BUILT MODELS)
api_url = "https://pornhub.co"

# Define headers
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}

# Define the payload
payload = {
    "instances": ["one woman alone in room"],
    "parameters": {
        "guidance_scale": 8,
        "negative_prompt": "blurry, low quality, etc",
        "num_inference_steps": 50,
        "width": 1024,
        "height": 1024,
        "seed": 12345
    }
}

# Make the POST request
response = requests.post(api_url, headers=headers, data=json.dumps(payload))

# Print the status code (for debugging)
print("Status Code:", response.status_code)  # Keep this to see if the request was successful

# Handle the response
if response.status_code == 200:
    result = response.json()

    # Get the base64 encoded string directly from the predictions array
    image_data_base64 = result['predictions'][0]

    # Decode the base64 data
    image_data = base64.b64decode(image_data_base64)

    # Create an image from the decoded data
    image = Image.open(BytesIO(image_data))

    # Display or save the image
    image.show()  # Or: image.save("output.png")

else:
    print("Error:", response.status_code, response.text)  # Print error details if needed

I wanted to add images too, but they are pretty large for just static: about 500 KB for a 512x512 and 2.5 MB at 1024x1024, so here is a screenshot

(now my initial response was deleted because I wanted to be funny when censoring my API URL… just so you know, it might show up again)


1 Like

What I don't get is that most examples show:

{
  "inputs": "Hugging Face, the winner of VentureBeatā€™s Innovation in Natural Language Process/Understanding Award for 2021, is looking to level the playing field. The team, launched by ClĆ©ment Delangue and Julien Chaumond in 2016, was recognized for its work in democratizing NLP, the global market value for which is expected to hit $35.1 billion by 2026. This week, Googleā€™s former head of Ethical AI Margaret Mitchell joined the team.",
  "parameters": {
    "repetition_penalty": 4.0,
    "max_length": 128
  }
}

but

"inputs": "my prompt text",

always leads to the error:

Response Text: {"error":"`prompt` has to be of type `str` or `list` but is <class 'dict'>"}
Error: 400 {"error":"`prompt` has to be of type `str` or `list` but is <class 'dict'>"}

and neither Gemini nor GPT are able to find a solution
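
For reference, a hedged side-by-side of the two payload shapes in play here (the first is the HF Inference API convention from those examples; the second is what this Vertex AI endpoint turned out to accept later in the thread):

    # HF Inference API convention, as seen in most examples:
    hf_payload = {
        "inputs": "my prompt text",
        "parameters": {"max_length": 128},
    }

    # What the Vertex AI endpoint accepted: bare prompt strings under "instances"
    vertex_payload = {
        "instances": ["my prompt text"],
        "parameters": {"num_inference_steps": 50},
    }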

1 Like

In architectures such as Stable Diffusion, noise is prepared and then gradually removed to generate the final image, so it looks like noise in the early stages. Is it possible that an image from the middle of generation is being returned, or that multiple images are being returned and you are looking at one from mid-generation?
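
If you can run the model locally, you can watch that noise-to-image progression directly. A minimal sketch using Diffusers' step callback (assuming a recent diffusers version and a GPU; the model ID is just an example):

    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    def on_step_end(pipeline, step, timestep, callback_kwargs):
        # Early steps: latents are essentially noise; late steps: the final image
        latents = callback_kwargs["latents"]
        print(f"step {step}: latent std = {latents.std().item():.3f}")
        return callback_kwargs

    image = pipe("A cat", callback_on_step_end=on_step_end).images[0]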

Also, with the SDXL architecture, unless otherwise specified, a 1024x1024 image should be returned. With SD1.5, 512x512 is common.

By the way, my uploaded model is operated the same way, but I thought it might be better to use a more famous model for this test.

goodnight

First off, please know I really appreciate the time you are taking here! Usually I don't ask many questions on forums because I am bothered by people letting others do the troubleshooting for them, but I have already spent days trying to get this to work :frowning:

I already tried 3 different models, but I am deploying the one you linked right now.

According to the little information the logs on Vertex AI provide, I don't think there is an issue like the one you described. I only send one request after another, and the log output suggests that it starts processing, finishes processing, and outputs one image at a time. Sadly, the logs don't state what prompt was received.

I will report back when the new model is running.

1 Like

I finally have it working!!!

This MODEL works for me out of the box.

I added a very simple GUI to the Python script, which also lets you save and load your inputs, just to make it a bit more user friendly, and of course I am sharing it with you:

import requests
import json
import base64
from google.auth import default
from google.auth.transport.requests import Request
from PIL import Image
from io import BytesIO
import tkinter as tk
from tkinter import ttk, scrolledtext, Scale, HORIZONTAL, messagebox
import configparser
import os
import datetime

# --- Configuration File ---
CONFIG_FILE = "config.ini"
config = configparser.ConfigParser()

# --- Image Output Directory ---
IMG_DIR = "img"
os.makedirs(IMG_DIR, exist_ok=True)

class InferenceOutputError(Exception):
    def __init__(self, message):
        self.message = message

def text_to_image(api_url: str, access_token: str, inputs: str, parameters: dict = None, options: dict = None) -> bytes:
    """
    Generates an image from text using a Vertex AI endpoint.
    """
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json"
    }
    payload = {
        "instances": [inputs],
        "parameters": parameters or {}
    }
    response = requests.post(api_url, headers=headers, data=json.dumps(payload))
    print("Status Code:", response.status_code)
    if response.status_code == 200:
        result = response.json()
        print("Result:", result)
        image_data_base64 = result['predictions'][0]
        image_data = base64.b64decode(image_data_base64)
        if isinstance(image_data, bytes) and len(image_data) > 0:
            return image_data
        else:
            raise InferenceOutputError("Expected bytes (image data)")
    else:
        print("Error:", response.status_code, response.text)
        raise InferenceOutputError(f"HTTP Error: {response.status_code}")

def load_config():
    """Loads configuration from the config file."""
    config.read(CONFIG_FILE)
    # Clear any existing text first so repeated loads don't duplicate the prompts
    prompt_text.delete("1.0", tk.END)
    negative_prompt_text.delete("1.0", tk.END)
    if "settings" in config:
        prompt_text.insert(tk.END, config.get("settings", "prompt", fallback=""))
        negative_prompt_text.insert(tk.END, config.get("settings", "negative_prompt", fallback=""))
        inference_steps_slider.set(config.getint("settings", "inference_steps", fallback=30))
        guidance_scale_slider.set(config.getint("settings", "guidance_scale", fallback=8))
        resolution_var.set(config.get("settings", "resolution", fallback="512x512"))
    else:
        # Set default values if the config file does not have settings
        inference_steps_slider.set(30)
        guidance_scale_slider.set(5)
        resolution_var.set("512x512")

def save_config():
    """Saves the current input values to the config file."""
    config["settings"] = {
        "prompt": prompt_text.get("1.0", tk.END).strip(),
        "negative_prompt": negative_prompt_text.get("1.0", tk.END).strip(),
        "inference_steps": inference_steps_slider.get(),
        "guidance_scale": guidance_scale_slider.get(),
        "resolution": resolution_var.get(),
    }
    with open(CONFIG_FILE, "w") as configfile:
        config.write(configfile)
    messagebox.showinfo("Info", "Configuration saved!")

def generate_image():
    global access_token

    inputs = prompt_text.get("1.0", tk.END).strip()
    negative_prompt = negative_prompt_text.get("1.0", tk.END).strip()
    try:
        num_inference_steps = int(inference_steps_slider.get())
        guidance_scale = int(guidance_scale_slider.get())
    except ValueError:
        messagebox.showerror("Error", "Inference steps and guidance scale must be integers.")
        return

    resolution = resolution_var.get()
    width, height = map(int, resolution.split("x"))

    parameters = {
        "negative_prompt": negative_prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "width": width,
        "height": height,
        "seed": 12345
    }

    try:
        image_bytes = text_to_image(api_url, access_token, inputs, parameters)

        # --- Save Image ---
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        image_filename = f"image_{timestamp}.png"
        image_path = os.path.join(IMG_DIR, image_filename)

        image = Image.open(BytesIO(image_bytes))
        image.save(image_path)
        print(f"Image saved to: {image_path}")

        # --- Display Image ---
        image.show()

    except InferenceOutputError as e:
        messagebox.showerror("Error", f"Inference Error: {e.message}")
    except Exception as e:
        messagebox.showerror("Error", f"An unexpected error occurred: {e}")

# --- GUI Setup ---
window = tk.Tk()
window.title("Stable Diffusion Image Generator")

# --- Credentials ---
# Use Application Default Credentials (ADC)
credentials, project = default()
credentials.refresh(Request())
access_token = credentials.token

# --- API URL ---
# Replace with your endpoint URL
api_url = "REPLACE_WITH_YOUR_ENDPOINT_URL"

# --- Prompt ---
prompt_label = ttk.Label(window, text="Prompt:")
prompt_label.grid(column=0, row=0, sticky=tk.W, padx=5, pady=5)
prompt_text = scrolledtext.ScrolledText(window, wrap=tk.WORD, height=5)
prompt_text.grid(column=0, row=1, padx=5, pady=5, columnspan=2, sticky="nsew")

# --- Negative Prompt ---
negative_prompt_label = ttk.Label(window, text="Negative Prompt:")
negative_prompt_label.grid(column=0, row=2, sticky=tk.W, padx=5, pady=5)
negative_prompt_text = scrolledtext.ScrolledText(window, wrap=tk.WORD, height=5)
negative_prompt_text.grid(column=0, row=3, padx=5, pady=5, columnspan=2, sticky="nsew")

# --- Inference Steps ---
inference_steps_label = ttk.Label(window, text="Inference Steps (0-200):")
inference_steps_label.grid(column=0, row=4, sticky=tk.W, padx=5, pady=5)
inference_steps_slider = Scale(window, from_=0, to=200, orient=HORIZONTAL)
inference_steps_slider.set(30) # Default value
inference_steps_slider.grid(column=1, row=4, padx=5, pady=5, sticky="ew")

# --- Resolution ---
resolution_label = ttk.Label(window, text="Resolution:")
resolution_label.grid(column=0, row=5, sticky=tk.W, padx=5, pady=5)
resolution_var = tk.StringVar(window)
resolutions = ["512x512", "768x512", "512x768", "1024x768", "768x1024", "1216x832", "832x1216", "1024x1024"]
resolution_var.set(resolutions[0]) # Default value
resolution_dropdown = ttk.Combobox(window, textvariable=resolution_var, values=resolutions)
resolution_dropdown.grid(column=1, row=5, padx=5, pady=5, sticky="ew")

# --- Guidance Scale ---
guidance_scale_label = ttk.Label(window, text="Guidance Scale (0-20):")
guidance_scale_label.grid(column=0, row=6, sticky=tk.W, padx=5, pady=5)
guidance_scale_slider = Scale(window, from_=0, to=20, orient=HORIZONTAL)
guidance_scale_slider.set(5) # Default value
guidance_scale_slider.grid(column=1, row=6, padx=5, pady=5, sticky="ew")

# --- Load Configuration ---
load_button = ttk.Button(window, text="Load Configuration", command=load_config)
load_button.grid(column=0, row=8, padx=5, pady=10)

# --- Save Configuration ---
save_button = ttk.Button(window, text="Save Configuration", command=save_config)
save_button.grid(column=1, row=8, padx=5, pady=10)

# --- Generate Button ---
generate_button = ttk.Button(window, text="Generate Image", command=generate_image)
generate_button.grid(column=0, row=7, padx=5, pady=10, columnspan=2)

# --- Configure grid weights ---
window.columnconfigure(0, weight=1)
window.columnconfigure(1, weight=1)
for i in range(9):
    window.rowconfigure(i, weight=1)

# --- Load initial configuration ---
load_config()

# --- Run the GUI ---
window.mainloop()
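
To run it, the only prerequisites are Application Default Credentials and your endpoint URL (the same setup as earlier in the thread; the script filename here is just a placeholder):

    gcloud auth application-default login
    python sdxl_gui.py  # hypothetical filename for the script above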

@John6666 please know that I am deeply grateful for your help getting this to run!!

Since I was not able to find any resources on this specific set of issues, I hope it is okay to do some search engine optimization:
Vertex AI Stable Diffusion XL inference error, Python Vertex AI endpoint huggingface issues, SDXL Vertex AI deployment troubleshooting, Stable Diffusion XL prompt formatting Vertex AI, Vertex AI image generation errors, Python requests to Vertex AI endpoint, Base64 decoding error Vertex AI, Vertex AI prediction response format, Stable Diffusion XL custom container handling, Authentication error Vertex AI, HTTP 400 error Vertex AI, TypeError: string indices must be integers Vertex AI, “can only concatenate tuple (not “dict”) to tuple” Vertex AI error, Vertex AI prediction timeout, Vertex AI SDK Stable Diffusion XL example, google-cloud-aiplatform Stable Diffusion, diffusers on Vertex AI, Stable Diffusion XL inference request Python, Vertex AI endpoint input format, Vertex AI endpoint output format, Troubleshooting Vertex AI online prediction, Vertex AI custom model input validation, Vertex AI pre-built container Stable Diffusion, Vertex AI prediction request payload, Vertex AI Python client library, GUI for stable diffusion model Vertex AI, Vertex AI Stable Diffusion XL tutorial, Vertex AI Stable Diffusion XL guide, Configuring Vertex AI endpoint for Stable Diffusion XL, Vertex AI model expects string or list, Vertex AI model expects dictionary, InferenceOutputError Vertex AI, TypeError Vertex AI, HTTP Error Vertex AI, image generation from Vertex AI returning static

1 Like

!!!

using all the same inputs!
Guess I'll have to find an NSFW model that works out of the box like this!

Thank you very much for the help!

1 Like