I’m attempting to run a query similar to this one against the Hugging Face inference endpoints:
import requests

api_url = 'https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3.1-70B-Instruct'
headers = {'Authorization': f'Bearer {token}'}
response = requests.post(api_url, headers=headers, json={'inputs': 'What is the capital of France? The capital of France is: '})
I’m not just looking for the answer, but also for the logits of the generated sequence: I want to be able to compute the probability of getting a particular answer.
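For context, the computation I have in mind is the standard one: the probability of an answer is the product of the per-token conditional probabilities, each obtained by a softmax over the logits at that step. A minimal sketch with made-up logits (the toy vocabulary and values are purely illustrative):

```python
import math

def sequence_logprob(step_logits, answer_token_ids):
    """Sum the log-softmax probability of each answer token,
    given one logits vector per generation step."""
    total = 0.0
    for logits, tok in zip(step_logits, answer_token_ids):
        # log-sum-exp with max-subtraction for numerical stability
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += logits[tok] - log_z
    return total

# Toy example: vocabulary of 3 tokens, answer is token 2 then token 0.
step_logits = [[1.0, 0.5, 2.0], [3.0, 0.0, 1.0]]
logprob = sequence_logprob(step_logits, [2, 0])
print(math.exp(logprob))  # probability of that two-token answer
```

So all I really need from the API is the per-step logits (or log-probabilities) for the tokens it generates.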
I can do this locally with AutoModelForCausalLM, but most big models don’t fit on my GPUs (and an HF Pro subscription is cheaper than another A100).
Is there any way to get this information from the API?