Unable to get inference results after deploying model to Inference Endpoints

I deployed a fine-tune of Gemma 2 9B, converted to GGUF format and hosted in this repository, using Inference Endpoints. The server starts, but inference does not work.

When testing the endpoint directly in the UI with the default inputs:

{"inputs": "Hello world!"}

the response is:

[object Object]

Calling the API from a local Python script (roughly as sketched below) returns a 401 "File not found" error.
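For reference, the call looks roughly like this (the endpoint URL and token below are placeholders, not my real values):

```python
import requests

# Placeholders — replace with the actual endpoint URL and Hugging Face token
ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

response = requests.post(
    ENDPOINT_URL,
    headers={
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "application/json",
    },
    json={"inputs": "Hello world!"},
)

print(response.status_code)
print(response.text)  # raw body, to see the actual JSON instead of [object Object]
```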

Any ideas on what is going on here? Thanks in advance.
