I’m attempting to use the Hugging Face Inference API to interact with the google/gemma-3-27b-it
model via the following cURL request, as outlined in the documentation:
curl https://router.huggingface.co/hf-inference/models/google/gemma-3-27b-it/v1/chat/completions \
  -H 'Authorization: Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxx' \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Describe this image in one sentence."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 512,
    "model": "google/gemma-3-27b-it",
    "stream": false
  }'
Despite setting the correct bearer token and headers, the API returns the following response:
{
  "error": "Not Found: "
}
- Could you confirm whether this is the correct endpoint for interacting with the gemma-3-27b-it model via the Hugging Face Inference API?
- Does google/gemma-3-27b-it currently support multimodal input (e.g., image + text) in the chat/completions format?
- If not, is there another endpoint or model that supports such multimodal interactions with similar capabilities?
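For context while investigating, the model's Hub metadata can be checked via the public Hub API. The first call below uses the standard models route; the expand[] parameter in the second is my assumption based on the Inference Providers docs, so treat it as a sketch.

# Basic model metadata (pipeline_tag, tags, gating status, etc.)
curl https://huggingface.co/api/models/google/gemma-3-27b-it

# Which inference providers (if any) serve this model; the expand[]
# parameter name is an assumption from the Inference Providers docs.
curl 'https://huggingface.co/api/models/google/gemma-3-27b-it?expand[]=inferenceProviderMapping'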
Additional Info
- The API key is valid and has been tested with other models successfully.
- The request format is modeled after the Hugging Face documentation for multimodal chat-based inference.
Any guidance on the proper usage or endpoint for this model would be greatly appreciated.