Inference endpoint data privacy

Is data sent to and received back from the Hugging Face Inference API kept fully private (i.e., LLM prompts and completions)? Is it safe to send proprietary and confidential information in prompts?

I know this is not the case with, for example, the OpenAI API.

Any insight would be helpful. Thanks!

Here is the security & compliance information for Hugging Face Inference Endpoints (not the free public Inference API):


Thanks philshmid! Seems pretty clear. The following makes me think Hugging Face stores access logs for 30 days (like request history) but not the actual prompt tokens. Would you agree?

“Hugging Face does not store any customer data in terms of payloads or tokens that are passed to the Inference Endpoint. We are storing logs for 30 days.”

With Inference Endpoints, we are not storing any payloads.
But as a user, you can create custom handlers, and if you print something in there, it is stored for 30 days, since it's logged.

To be clear again: this applies only to the Inference Endpoints product, not the Inference API.
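To make the logging point concrete, here is a minimal sketch of a custom handler (`handler.py`) for an Inference Endpoint. The `EndpointHandler` class shape follows the Inference Endpoints custom-handler convention; the model loading and response are placeholders. The key caveat is in the comments: anything you `print` inside the handler lands in the endpoint logs, which are retained for 30 days.

```python
# handler.py -- minimal custom handler sketch for an Inference Endpoint.
# Placeholder model logic; the point is what does (and doesn't) get logged.

class EndpointHandler:
    def __init__(self, path=""):
        # In a real handler you would load your model from `path` here.
        self.model = None  # placeholder

    def __call__(self, data):
        inputs = data.get("inputs", "")
        # print(inputs)  # DON'T: this would write the raw payload into the
        #                # endpoint logs, where it is kept for 30 days.
        print("received a request")  # logging only metadata is safer
        return [{"generated_text": inputs}]  # placeholder echo response
```

So the "no payload storage" guarantee holds only as long as your own handler code doesn't print the payload into the logs.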

That makes sense, thanks. I'd probably just use the paid Inference Endpoint. I see that there's a range of prices, depending on instance specs. My needs are mid-range: inference on one of the new Open Assistant LLM models. I'm assuming I'll only be charged for active inference time, and that I don't need to pay to keep a compute instance live or in a "listening state" when I'm not sending requests? In other words: is usage and access on demand?

Ok. It seems my assumption was wrong. From what I can tell, using a paid Inference Endpoint involves paying to instantiate a compute instance and keep it running continuously. I was hoping for something more "on demand," but it makes sense, I suppose. I think the Inference API is a better fit for the early stages of my project.
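For anyone else landing here, calling the serverless Inference API looks roughly like this (a sketch, not official docs: `MODEL_ID` and the `hf_...` token are placeholders you'd fill in with your own values):

```python
import requests

# Placeholders: substitute your model repo id and your own HF access token.
API_URL = "https://api-inference.huggingface.co/models/MODEL_ID"
HEADERS = {"Authorization": "Bearer hf_..."}


def query(payload):
    """POST a JSON payload to the Inference API and return the parsed reply."""
    response = requests.post(API_URL, headers=HEADERS, json=payload)
    response.raise_for_status()  # surface rate-limit / auth errors early
    return response.json()


# Example payload for a text-generation model:
# query({"inputs": "Write a haiku about privacy."})
```

You pay nothing to keep anything "alive" here; you just send requests, subject to the API's rate limits.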

Phil (or someone else who knows), would you mind explaining a little more about how, and whether, payloads and tokens are stored and accessed on the Inference API? My goal is to interact with my chosen model about proprietary projects, so privacy and security are important. I can't find this information in the documentation. Also, what are the usage/rate limits?

Sorry for all the questions, and thanks in advance for your support!