[text classification] different result format for inference API and inference endpoint

Hi all!

I am trying to deploy an Inference Endpoint for the DeBERTa Base MNLI model. While the endpoint deploys successfully, I am finding that the result format differs between the free Inference API and the deployed Inference Endpoint. See the results below:

Inference endpoint:

>>> requests.post("<MY INFERENCE ENDPOINT>", headers={"Authorization": f"Bearer <HF API KEY>"}, json={"inputs": "[CLS] I love you. [SEP] I like you. [SEP]"}).json()
[{'label': 'ENTAILMENT', 'score': 0.9248048663139343}]

Inference API:

>>> requests.post("https://api-inference.huggingface.co/models/microsoft/deberta-base-mnli", headers={"Authorization": f"Bearer <HF API KEY>"}, json={"inputs": "[CLS] I love you. [SEP] I like you. [SEP]"}).json()
[[{'label': 'ENTAILMENT', 'score': 0.9248047471046448}, {'label': 'NEUTRAL', 'score': 0.07485755532979965}, {'label': 'CONTRADICTION', 'score': 0.00033764145337045193}]]

Note that through the Inference API I receive ENTAILMENT, NEUTRAL, and CONTRADICTION scores, whereas with the Inference Endpoint I only receive the top ENTAILMENT score. For my application, I need all three scores.
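For what it's worth, this is the request I would try next, on the assumption that Inference Endpoints forward a "parameters" field to the underlying transformers text-classification pipeline (where top_k=None means "score every label"). I haven't confirmed this is actually supported, hence the question:

import requests

# Hypothetical workaround: ask the text-classification pipeline for all
# labels via top_k=None. This assumes the endpoint passes the "parameters"
# field through to the pipeline call, which I have not verified.
response = requests.post(
    "<MY INFERENCE ENDPOINT>",
    headers={"Authorization": "Bearer <HF API KEY>"},
    json={
        "inputs": "[CLS] I love you. [SEP] I like you. [SEP]",
        "parameters": {"top_k": None},
    },
)
print(response.json())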

It’s not immediately clear to me what’s going on here. I created the Inference Endpoint via the Deploy dropdown on the deberta-base-mnli model page, so I would expect the deployed endpoint to return the same result format as the Inference API. I can reproduce this with both CPU and GPU Inference Endpoints for deberta-base-mnli, and also with an Inference Endpoint for deberta-large-mnli.
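In case the answer turns out to be "write a custom handler", here is a minimal sketch of what I have in mind, based on the Inference Endpoints custom handler interface (an EndpointHandler class in handler.py with __init__ and __call__); using top_k=None to get all scores is my assumption:

# handler.py -- minimal custom handler sketch. top_k=None is the
# transformers text-classification argument for returning every label.
from typing import Any, Dict, List

from transformers import pipeline


class EndpointHandler:
    def __init__(self, path: str = ""):
        # "path" points at the model repository the endpoint was built from
        self.pipeline = pipeline("text-classification", model=path, top_k=None)

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        inputs = data.pop("inputs", data)
        # For a single input string, this should return a list of
        # {'label': ..., 'score': ...} dicts, one per class.
        return self.pipeline(inputs)

That said, I'd rather avoid maintaining a custom handler if the stock one can already return all scores.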

Thanks in advance for the help!