Late to the party, how do I handle a (NSFW) image generation model I uploaded to Vertex AI?

Yes, I know I am late to the party, and I guess that's the reason why Google isn't really helping me find answers.
I picked a model and hosted it on my Vertex AI space. The model is running, but I am having trouble finding the right docs to control it via Python.

That's the model in question, and I would really appreciate a few pointers to the right resources!

1 Like

Yes, it can be challenging to find the right documentation, especially when getting started with Vertex AI. Don't worry, you're in the right place to figure it out! Here's a quick rundown to help you get started:

  1. Vertex AI Python SDK:
    You'll likely need the Vertex AI Python SDK (google-cloud-aiplatform) to interact with your hosted model programmatically. Start by installing it:

    pip install google-cloud-aiplatform  
    
  2. Authentication:
    Make sure you have authenticated your Python environment with Google Cloud. If you haven't already:

    gcloud auth application-default login  
    
  3. Getting Started with Deployed Models:
    Once your model is deployed, you can send prediction requests using the SDK. Here's a basic example (see also the endpoint-listing sketch after these steps):

    from google.cloud import aiplatform  
    
    # Initialize the client  
    aiplatform.init(project='your-project-id', location='your-region')  
    
    # Specify endpoint details  
    endpoint = aiplatform.Endpoint(endpoint_name='your-endpoint-id')  
    
    # Send a prediction  
    response = endpoint.predict(instances=[{"your-input": "value"}])  
    print(response.predictions)  
    

Happy to help further! :blush:

1 Like

Basically, this should work. I think it's more difficult to install PyTorch than the code itself. I don't know if it can be used with Vertex AI…

# pip install -U diffusers peft accelerate transformers huggingface_hub
from diffusers import DiffusionPipeline

modelname = "John6666/****"
pipe = DiffusionPipeline.from_pretrained(modelname)  # add .to("cuda") if a GPU is available
prompt = "1girl"
image = pipe(prompt).images[0]

Hey guys, thanks a lot for your input, but I am afraid I might be completely lost, and what frustrates me most is that I am usually able to resolve stuff like this myself, but this time neither Google nor GPT help.
I read the Vertex AI docs and multiple config tutorials for different models that are also based on SDXL, but I keep getting the same few errors.

Current state:
I changed the model to LINK and successfully deployed it. The model is running and I have access to the logs. Trying to use the Vertex AI model testing feature, which lets you input a JSON snippet for a call, always gets me the same error about not allowing strings as the prompt.
My Python script at least reaches the model and receives error messages, and the calls show up in the model's log, but no matter what I do I can't seem to get it right.
I tried countless different formatting options, but even with the help of o1-preview I can't get past this error:

google.api_core.exceptions.InvalidArgument: 400 {"error":"`prompt` has to be of type `str` or `list` but is <class 'dict'>"}       

I really do know what a string is, and please trust me, I tried so many different ways… At this point I think I am missing a crucial step, and I would appreciate some more help!
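
Reading that error literally: whatever reaches the handler's `prompt` argument is arriving as a dict, which suggests the whole {"inputs": ...} object is being passed through as the prompt. A hedged sketch of what that implies for the request shape (this is borne out further down the thread):

    # Bare prompt strings inside the instances list, not {"inputs": ...} dicts
    instances = ["A cat"]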

This is just one of the countless versions I tried:

import base64

from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value


def predict_text_to_image(
    project_id: str,
    endpoint_id: str,
    prompt: str,
    location: str = "my-basement",
    api_endpoint: str = "complicated garbage",
):
    client_options = {"api_endpoint": api_endpoint}
    client = aiplatform.gapic.PredictionServiceClient(client_options=client_options)

    endpoint = client.endpoint_path(
        project=project_id, location=location, endpoint=endpoint_id
    )

    instances = [
        {"inputs": prompt}
    ]

    # Optionally, you can set parameters (e.g., specifying image size)
    parameters = None  # or set to a dictionary if your model expects parameters

    instances = [
        json_format.ParseDict(instance, Value()) for instance in instances
    ]
    if parameters is not None:
        parameters = json_format.ParseDict(parameters, Value())

    # Make the prediction request
    response = client.predict(
        endpoint=endpoint, instances=instances, parameters=parameters
    )

    # Handle the response
    print("Response:")
    print(f" Deployed Model ID: {response.deployed_model_id}")

    # Process each prediction
    for i, prediction in enumerate(response.predictions):
        # Convert the prediction (protobuf Value) to a dictionary
        prediction_dict = json_format.MessageToDict(prediction)

        # Assuming the image is returned as a base64-encoded string under 'image' key
        if 'image' in prediction_dict:
            img_b64 = prediction_dict['image']
            # Decode the base64 image
            img_bytes = base64.b64decode(img_b64)
            # Save the image to a file
            image_filename = f"output_{i}.png"
            with open(image_filename, "wb") as img_file:
                img_file.write(img_bytes)
            print(f" Image saved as {image_filename}")
        else:
            print(" No image data found in the prediction response.")


if __name__ == "__main__":
    project_id = "cat-1337"
    endpoint_id = "13376969"
    location = "dreamland"

    prompt = "A cat"
    predict_text_to_image(
        project_id=project_id,
        endpoint_id=endpoint_id,
        prompt=prompt,
        location=location,
    )

1 Like

Perhaps this?

    instances = [
        {"inputs": prompt}
    ]

    # Optionally, you can set parameters (e.g., specifying image size)
    parameters = None  # or set to a dictionary if your model expects parameters

    instances = [
        json_format.ParseDict(instance, Value()) for instance in instances
    ]

to

    instances = {"inputs": prompt}

    # Optionally, you can set parameters (e.g., specifying image size)
    parameters = None  # or set to a dictionary if your model expects parameters

    instances = json_format.ParseDict(instances, Value())

from

# You can go from a Python dict or JSON string to protobuf like:

import json

from google.protobuf.json_format import Parse, ParseDict

d = {
    "first": "a string",
    "second": True,
    "third": 123456789
}

message = ParseDict(d, Thing())
# or
message = Parse(json.dumps(d), Thing())    

print(message.first)  # "a string"
print(message.second) # True
print(message.third)  # 123456789
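
Also, since Value is a protobuf well-known type, ParseDict should accept plain JSON scalars too, not just dicts. So if the container really wants the prompt as a bare string, something like this might work (a sketch, untested on Vertex AI):

    from google.protobuf import json_format
    from google.protobuf.struct_pb2 import Value

    prompt = "A cat"
    # Wrap the bare prompt string itself, with no {"inputs": ...} dict around it
    instances = [json_format.ParseDict(prompt, Value())]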

Thanks mate!

We are finally making progress!! Messing around some more with "instances" while applying your suggestions, I found out that instances = prompt with prompt = "this is my prompt" finally produces something…

Instead of a picture of a test object I received some colorful static, but hey, at least I finally received an image file!

I guess now it is about finding the right parameters, but I ran into another issue that had me redeploy the model, which takes forever… I increased the inference steps and the model would hang while generating, without any progress.

1 Like

I have spent countless hours now trying to get this to work and tried several different SDXL models, but I think I am completely lost. I don't get why there are no complete Python script examples out there in the wild for use with Google Cloud/Vertex AI, at least I can't find any. :frowning:

1 Like

That's true. At the very least, Gemini seems like it should know about it…:sweat_smile:
There may be documentation on how to use the HF Endpoint API or the Serverless Inference API.
However, I can't find much when I search…
The JSON specification on the HF side can be found in the documentation and on GitHub.

Hah, trust me, you don't want to know how many Gemini calls I made, which are free at least, but I also used way too many paid tokens over at OpenAI…

1 Like

OK, I am still getting images that look like static. Changing height/width in my call changes the received image's resolution, and changing inference steps (0-100) only increases the resolution of the static.

I found this error in the Vertex AI logs, which is roughly the same for all three models I tried, but I have no idea how to change the model's config on a preconfigured model like that. Any ideas?


Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]
Loading pipeline components...:  14%|█▍        | 1/7 [00:03<00:19,  3.29s/it]The config attributes {'beta_end': 0.012, 'beta_schedule': 'scaled_linear', 'beta_start': 0.00085, 'clip_sample': False, 'interpolation_type': 'linear', 'set_alpha_to_one': False, 'skip_prk_steps': True, 'steps_offset': 1, 'timestep_spacing': 'leading', 'trained_betas': None, 'use_karras_sigmas': False} were passed to EDMDPMSolverMultistepScheduler, but are not expected and will be ignored. Please verify your scheduler_config.json configuration file.

edit: this error comes up right after the model is deployed

1 Like

That error is a sampler warning that appears depending on the version of Diffusers, but it shouldn't cause any real harm.
And if it does appear, it's proof that the model is at least being loaded more or less correctly.

Even so, the fact that the same kind of output is always returned suggests that everything apart from the parameters is generally correct, and that the parameters just aren't being recognized. Actually, I wonder if the parameter names the Endpoint expects are different from the ones we're using now? I'll check it out a bit.

Edit:
I can't find any documentation on Text-to-Image…

Documents

Inference Endpoint

Additional parameters

This is my current code, since I switched over to POST because it is faster. I can still only get static, which changes depending on my height/width/inference settings, and I now discovered I am getting:
Status Code: 200

import requests
import json
import base64
from google.auth import default
from google.auth.transport.requests import Request
from PIL import Image
from io import BytesIO

# Obtain credentials
credentials, project = default()
credentials.refresh(Request())
access_token = credentials.token

# Define the endpoint URL (UPDATE FOR PRE-BUILT MODELS)
api_url = "https://pornhub.co"

# Define headers
headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}

# Define the payload
payload = {
    "instances": ["one woman alone in room"],
    "parameters": {
        "guidance_scale": 8,
        "negative_prompt": "blurry, low quality, etc",
        "num_inference_steps": 50,
        "width": 1024,
        "height": 1024,
        "seed": 12345
    }
}

# Make the POST request
response = requests.post(api_url, headers=headers, data=json.dumps(payload))

# Print the status code (for debugging)
print("Status Code:", response.status_code)  # Keep this to see if the request was successful

# Handle the response
if response.status_code == 200:
    result = response.json()

    # Get the base64 encoded string directly from the predictions array
    image_data_base64 = result['predictions'][0]

    # Decode the base64 data
    image_data = base64.b64decode(image_data_base64)

    # Create an image from the decoded data
    image = Image.open(BytesIO(image_data))

    # Display or save the image
    image.show()  # Or: image.save("output.png")

else:
    print("Error:", response.status_code, response.text)  # Print error details if needed

I wanted to add images too, but they are pretty large for just static: about 500 KB for a 512x512 and 2.5 MB at 1024x1024, so here is a screenshot

(now my initial response was deleted because I wanted to be funny when censoring my API URL… just so you know, it might show up again)


1 Like

What I don't get is that most examples show:

{
  "inputs": "Hugging Face, the winner of VentureBeatā€™s Innovation in Natural Language Process/Understanding Award for 2021, is looking to level the playing field. The team, launched by ClĆ©ment Delangue and Julien Chaumond in 2016, was recognized for its work in democratizing NLP, the global market value for which is expected to hit $35.1 billion by 2026. This week, Googleā€™s former head of Ethical AI Margaret Mitchell joined the team.",
  "parameters": {
    "repetition_penalty": 4.0,
    "max_length": 128
  }
}

but

"inputs": "my prompt text",

always leads to the error:

Response Text: {"error":"`prompt` has to be of type `str` or `list` but is <class 'dict'>"}
Error: 400 {"error":"`prompt` has to be of type `str` or `list` but is <class 'dict'>"}

and neither Gemini nor GPT are able to find a solution
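
For reference, a hedged side-by-side of the two payload shapes in play here (the first is the HF Inference API convention from those examples; the second is what this Vertex AI endpoint turned out to accept later in the thread):

    # HF Inference API convention, as seen in most examples:
    hf_payload = {
        "inputs": "my prompt text",
        "parameters": {"max_length": 128},
    }

    # What the Vertex AI endpoint accepted: bare prompt strings under "instances"
    vertex_payload = {
        "instances": ["my prompt text"],
        "parameters": {"num_inference_steps": 50},
    }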

1 Like

In architectures such as Stable Diffusion, noise is prepared and then gradually removed to generate the final image, so it looks like noise in the early stages. Is it possible that an image from the middle of generation is being returned, or that multiple images are being returned and you are looking at one from mid-generation?
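
If you can run the model locally, you can watch that noise-to-image progression directly. A minimal sketch using Diffusers' step callback (assuming a recent diffusers version and a GPU; the model ID is just an example):

    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    def on_step_end(pipeline, step, timestep, callback_kwargs):
        # Early steps: latents are essentially noise; late steps: the final image
        latents = callback_kwargs["latents"]
        print(f"step {step}: latent std = {latents.std().item():.3f}")
        return callback_kwargs

    image = pipe("A cat", callback_on_step_end=on_step_end).images[0]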

Also, with the SDXL architecture, unless otherwise specified, a 1024x1024 image should be returned. With SD1.5, 512x512 is common.

By the way, my uploaded model is operated the same way, but I thought it might be better to use a more famous model for this test.

goodnight

First off, please know I really appreciate the time you are taking here! Usually I don't ask many questions on forums because I am bothered by people letting others do the troubleshooting for them, but I have already spent days trying to get this to work :frowning:

I already tried 3 different models, but I am deploying the one you linked right now.

According to the little information the logs on Vertex AI provide, I don't think there is an issue like the one you described. I only send one request after another, and the log output suggests that it starts processing, finishes processing, and outputs one image at a time. Sadly, the logs don't state what prompt was received.

I will report back when the new model is running.

1 Like

I finally have it working!!!

This MODEL works for me out of the box.

I added a very simple GUI to the Python script, which also lets you save and load your inputs, just to make it a bit more user friendly, and of course I am sharing it with you:

import requests
import json
import base64
from google.auth import default
from google.auth.transport.requests import Request
from PIL import Image
from io import BytesIO
import tkinter as tk
from tkinter import ttk, scrolledtext, Scale, HORIZONTAL, messagebox
import configparser
import os
import datetime

# --- Configuration File ---
CONFIG_FILE = "config.ini"
config = configparser.ConfigParser()

# --- Image Output Directory ---
IMG_DIR = "img"
os.makedirs(IMG_DIR, exist_ok=True)

class InferenceOutputError(Exception):
    def __init__(self, message):
        self.message = message

def text_to_image(api_url: str, access_token: str, inputs: str, parameters: dict = None, options: dict = None) -> bytes:
    """
    Generates an image from text using a Vertex AI endpoint.
    """
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json"
    }
    payload = {
        "instances": [inputs],
        "parameters": parameters or {}
    }
    response = requests.post(api_url, headers=headers, data=json.dumps(payload))
    print("Status Code:", response.status_code)
    if response.status_code == 200:
        result = response.json()
        print("Result:", result)
        image_data_base64 = result['predictions'][0]
        image_data = base64.b64decode(image_data_base64)
        if isinstance(image_data, bytes) and len(image_data) > 0:
            return image_data
        else:
            raise InferenceOutputError("Expected bytes (image data)")
    else:
        print("Error:", response.status_code, response.text)
        raise InferenceOutputError(f"HTTP Error: {response.status_code}")

def load_config():
    """Loads configuration from the config file."""
    config.read(CONFIG_FILE)
    # Clear any existing text first so repeated loads don't duplicate the prompts
    prompt_text.delete("1.0", tk.END)
    negative_prompt_text.delete("1.0", tk.END)
    if "settings" in config:
        prompt_text.insert(tk.END, config.get("settings", "prompt", fallback=""))
        negative_prompt_text.insert(tk.END, config.get("settings", "negative_prompt", fallback=""))
        inference_steps_slider.set(config.getint("settings", "inference_steps", fallback=30))
        guidance_scale_slider.set(config.getint("settings", "guidance_scale", fallback=8))
        resolution_var.set(config.get("settings", "resolution", fallback="512x512"))
    else:
        # Set default values if the config file does not have settings
        inference_steps_slider.set(30)
        guidance_scale_slider.set(5)
        resolution_var.set("512x512")

def save_config():
    """Saves the current input values to the config file."""
    config["settings"] = {
        "prompt": prompt_text.get("1.0", tk.END).strip(),
        "negative_prompt": negative_prompt_text.get("1.0", tk.END).strip(),
        "inference_steps": inference_steps_slider.get(),
        "guidance_scale": guidance_scale_slider.get(),
        "resolution": resolution_var.get(),
    }
    with open(CONFIG_FILE, "w") as configfile:
        config.write(configfile)
    messagebox.showinfo("Info", "Configuration saved!")

def generate_image():
    global access_token

    inputs = prompt_text.get("1.0", tk.END).strip()
    negative_prompt = negative_prompt_text.get("1.0", tk.END).strip()
    try:
        num_inference_steps = int(inference_steps_slider.get())
        guidance_scale = int(guidance_scale_slider.get())
    except ValueError:
        messagebox.showerror("Error", "Inference steps and guidance scale must be integers.")
        return

    resolution = resolution_var.get()
    width, height = map(int, resolution.split("x"))

    parameters = {
        "negative_prompt": negative_prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "width": width,
        "height": height,
        "seed": 12345
    }

    try:
        image_bytes = text_to_image(api_url, access_token, inputs, parameters)

        # --- Save Image ---
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        image_filename = f"image_{timestamp}.png"
        image_path = os.path.join(IMG_DIR, image_filename)

        image = Image.open(BytesIO(image_bytes))
        image.save(image_path)
        print(f"Image saved to: {image_path}")

        # --- Display Image ---
        image.show()

    except InferenceOutputError as e:
        messagebox.showerror("Error", f"Inference Error: {e.message}")
    except Exception as e:
        messagebox.showerror("Error", f"An unexpected error occurred: {e}")

# --- GUI Setup ---
window = tk.Tk()
window.title("Stable Diffusion Image Generator")

# --- Credentials ---
# Use Application Default Credentials (ADC)
credentials, project = default()
credentials.refresh(Request())
access_token = credentials.token

# --- API URL ---
# Replace with your endpoint URL
api_url = "REPLACE_WITH_YOUR_ENDPOINT_URL"

# --- Prompt ---
prompt_label = ttk.Label(window, text="Prompt:")
prompt_label.grid(column=0, row=0, sticky=tk.W, padx=5, pady=5)
prompt_text = scrolledtext.ScrolledText(window, wrap=tk.WORD, height=5)
prompt_text.grid(column=0, row=1, padx=5, pady=5, columnspan=2, sticky="nsew")

# --- Negative Prompt ---
negative_prompt_label = ttk.Label(window, text="Negative Prompt:")
negative_prompt_label.grid(column=0, row=2, sticky=tk.W, padx=5, pady=5)
negative_prompt_text = scrolledtext.ScrolledText(window, wrap=tk.WORD, height=5)
negative_prompt_text.grid(column=0, row=3, padx=5, pady=5, columnspan=2, sticky="nsew")

# --- Inference Steps ---
inference_steps_label = ttk.Label(window, text="Inference Steps (0-200):")
inference_steps_label.grid(column=0, row=4, sticky=tk.W, padx=5, pady=5)
inference_steps_slider = Scale(window, from_=0, to=200, orient=HORIZONTAL)
inference_steps_slider.set(30) # Default value
inference_steps_slider.grid(column=1, row=4, padx=5, pady=5, sticky="ew")

# --- Resolution ---
resolution_label = ttk.Label(window, text="Resolution:")
resolution_label.grid(column=0, row=5, sticky=tk.W, padx=5, pady=5)
resolution_var = tk.StringVar(window)
resolutions = ["512x512", "768x512", "512x768", "1024x768", "768x1024", "1216x832", "832x1216", "1024x1024"]
resolution_var.set(resolutions[0]) # Default value
resolution_dropdown = ttk.Combobox(window, textvariable=resolution_var, values=resolutions)
resolution_dropdown.grid(column=1, row=5, padx=5, pady=5, sticky="ew")

# --- Guidance Scale ---
guidance_scale_label = ttk.Label(window, text="Guidance Scale (0-20):")
guidance_scale_label.grid(column=0, row=6, sticky=tk.W, padx=5, pady=5)
guidance_scale_slider = Scale(window, from_=0, to=20, orient=HORIZONTAL)
guidance_scale_slider.set(5) # Default value
guidance_scale_slider.grid(column=1, row=6, padx=5, pady=5, sticky="ew")

# --- Load Configuration ---
load_button = ttk.Button(window, text="Load Configuration", command=load_config)
load_button.grid(column=0, row=8, padx=5, pady=10)

# --- Save Configuration ---
save_button = ttk.Button(window, text="Save Configuration", command=save_config)
save_button.grid(column=1, row=8, padx=5, pady=10)

# --- Generate Button ---
generate_button = ttk.Button(window, text="Generate Image", command=generate_image)
generate_button.grid(column=0, row=7, padx=5, pady=10, columnspan=2)

# --- Configure grid weights ---
window.columnconfigure(0, weight=1)
window.columnconfigure(1, weight=1)
for i in range(9):
    window.rowconfigure(i, weight=1)

# --- Load initial configuration ---
load_config()

# --- Run the GUI ---
window.mainloop()
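
To run it, the only prerequisites are Application Default Credentials and your endpoint URL (the same setup as earlier in the thread; the script filename here is just a placeholder):

    gcloud auth application-default login
    python sdxl_gui.py  # hypothetical filename for the script above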

@John6666 please know that I am deeply grateful for your help getting this to run!!

Since I was not able to find any resources on this specific set of issues, I hope it is okay to do some search engine optimization:
Vertex AI Stable Diffusion XL inference error, Python Vertex AI endpoint huggingface issues, SDXL Vertex AI deployment troubleshooting, Stable Diffusion XL prompt formatting Vertex AI, Vertex AI image generation errors, Python requests to Vertex AI endpoint, Base64 decoding error Vertex AI, Vertex AI prediction response format, Stable Diffusion XL custom container handling, Authentication error Vertex AI, HTTP 400 error Vertex AI, TypeError: string indices must be integers Vertex AI, “can only concatenate tuple (not “dict”) to tuple” Vertex AI error, Vertex AI prediction timeout, Vertex AI SDK Stable Diffusion XL example, google-cloud-aiplatform Stable Diffusion, diffusers on Vertex AI, Stable Diffusion XL inference request Python, Vertex AI endpoint input format, Vertex AI endpoint output format, Troubleshooting Vertex AI online prediction, Vertex AI custom model input validation, Vertex AI pre-built container Stable Diffusion, Vertex AI prediction request payload, Vertex AI Python client library, GUI for stable diffusion model Vertex AI, Vertex AI Stable Diffusion XL tutorial, Vertex AI Stable Diffusion XL guide, Configuring Vertex AI endpoint for Stable Diffusion XL, Vertex AI model expects string or list, Vertex AI model expects dictionary, InferenceOutputError Vertex AI, TypeError Vertex AI, HTTP Error Vertex AI, image generation from Vertex AI returning static

1 Like

!!!

using all the same inputs!
Guess I'll have to find an NSFW model that works out of the box like this!

Thank you very much for the help!

1 Like