Well, I just realized that in my case I’m running a GGUF in a llama.cpp container, so maybe the OpenAI-compatible endpoint is the only one available in that container? The docs say this:
" You can deploy any llama.cpp compatible GGUF on the Hugging Face Endpoints. When you create an endpoint with a GGUF model, a llama.cpp container is automatically selected using the latest image built from the master
branch of the llama.cpp repository. Upon successful deployment, a server with an OpenAI-compatible endpoint becomes available."
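If that's the case, talking to the deployed model through the standard OpenAI client should work. Here's a minimal sketch; the endpoint URL, token, and model name below are placeholders for your own deployment, not values from the docs:

```python
# Minimal sketch: querying a llama.cpp-backed Hugging Face Endpoint
# through its OpenAI-compatible API (openai>=1.0 client).
from openai import OpenAI

client = OpenAI(
    # Placeholder: substitute your endpoint's URL, keeping the /v1/ suffix.
    base_url="https://<your-endpoint>.endpoints.huggingface.cloud/v1/",
    # Placeholder: your Hugging Face access token.
    api_key="hf_xxx",
)

response = client.chat.completions.create(
    # llama.cpp's server serves a single loaded model, so this name is
    # typically not used for routing; any placeholder string should do.
    model="my-gguf-model",
    messages=[{"role": "user", "content": "Hello! Are you up and running?"}],
)
print(response.choices[0].message.content)
```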