Unable to get inference results after deploying model to Inference Endpoints

I deployed a fine-tune of Gemma 2 9B, converted to GGUF format and hosted in this repository, using Inference Endpoints. The server starts, but inference does not work.

When testing the endpoint directly in the UI with the default inputs:

{"inputs": "Hello world!"}

the response is:

[object Object]

Calling the API from a local Python script (roughly as sketched below) returns a 401 "File not found" error.
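For reference, the call looks roughly like this (the endpoint URL and token below are placeholders, not my real values):

```python
import requests

# Placeholders — replace with the actual endpoint URL and Hugging Face token
ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

response = requests.post(
    ENDPOINT_URL,
    headers={
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "application/json",
    },
    json={"inputs": "Hello world!"},
)

print(response.status_code)
print(response.text)  # raw body, to see the actual JSON instead of [object Object]
```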

Any ideas on what is going on here? Thanks in advance.
