"num_images_per_prompt" in Inference Endpoint request with a text-to-image model

Hi community,

I’m using a dedicated Inference Endpoint with a text-to-image model (stable-diffusion-xl-base-1-0-cqq).

  • The documentation indicates that I can use the parameter “num_images_per_prompt”. I experimentally set this parameter to 2 in my request, but the Base64-encoded string in the response does not seem to contain two images.
    How do I use this parameter correctly?

This is my curl request:

curl "https://some_instance.eu-west-1.aws.endpoints.huggingface.cloud" \
  -X POST \
  -H "Accept: application/json" \
  -H "Authorization: Bearer hf_XYZ..." \
  -H "Content-Type: application/json" \
  -d '{ "inputs": "a cat sitting on a mirror", "parameters": { "num_images_per_prompt": 2 }}' \
  > response.txt

In the response I’m actually searching for the Base64-encoded PNG signature (iVBORw0KGgo)… I can only find one occurrence.
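For reference, this is roughly how I count the images in the raw response (a small helper of my own, `count_pngs` is not from any API — it just counts occurrences of the Base64-encoded PNG signature):

```python
def count_pngs(body: str) -> int:
    # iVBORw0KGgo is the Base64 encoding of the PNG file signature
    # (\x89PNG\r\n\x1a\n), so each occurrence should mark the start of
    # one Base64-encoded PNG image in the response body.
    return body.count("iVBORw0KGgo")

# e.g.: count_pngs(open("response.txt").read())
```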

  • Another question: can I use multiple prompts in one request?

So something like:

curl "https://some_instance.eu-west-1.aws.endpoints.huggingface.cloud" \
  -X POST \
  -H "Accept: application/json" \
  -H "Authorization: Bearer hf_XYZ..." \
  -H "Content-Type: application/json" \
  -d '{ "inputs": [ "a cat sitting on a mirror", "a dog sitting on a mirror" ]}' \
  > response.txt

I tried this, and the model takes nearly double the time for generation, so I assume it did generate two images. But again, I can only find one image in the response.
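In case it helps with debugging, this is the sketch I use to pull out whatever PNGs are present in the response (my own helper, assuming each image is a plain Base64-encoded PNG string embedded in the JSON body):

```python
import base64
import re

def extract_pngs(body: str) -> list[bytes]:
    # Find every Base64 run that starts with the encoded PNG signature
    # (iVBORw0KGgo) and decode it back to raw PNG bytes.
    chunks = re.findall(r"iVBORw0KGgo[A-Za-z0-9+/=]*", body)
    return [base64.b64decode(chunk) for chunk in chunks]

# e.g.: save each image found in response.txt
# for i, png in enumerate(extract_pngs(open("response.txt").read())):
#     with open(f"image_{i}.png", "wb") as f:
#         f.write(png)
```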