Inference endpoint data privacy

Is data sent to and received back from the Hugging Face Inference API kept fully private (i.e., LLM prompts and completions)? Is it safe to send proprietary and confidential information in prompts?

I know this is not the case with, for example, the OpenAI API.

Any insight would be helpful. Thanks!

Here is the security & compliance information for Hugging Face Inference Endpoints (not the free public Inference API):


Thanks philshmid! Seems pretty clear. The following makes me think Hugging Face stores access logs for 30 days (like request history) but not the actual prompt tokens. Would you agree?

“Hugging Face does not store any customer data in terms of payloads or tokens that are passed to the Inference Endpoint. We are storing logs for 30 days.”

With Inference Endpoints, we are not storing any payloads.
But as a user, you can create custom handlers, and if you print something in there, it is stored for 30 days, since it's logged.

To be clear again: this applies only to the Inference Endpoints product, not the Inference API.
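To make the logging point concrete, here is a minimal sketch of a custom handler (`handler.py`) for an Inference Endpoint. The `EndpointHandler` class shape follows the Inference Endpoints custom-handler convention; the model loading and response are placeholders. The key caveat is in the comments: anything you `print` inside the handler lands in the endpoint logs, which are retained for 30 days.

```python
# handler.py -- minimal custom handler sketch for an Inference Endpoint.
# Placeholder model logic; the point is what does (and doesn't) get logged.

class EndpointHandler:
    def __init__(self, path=""):
        # In a real handler you would load your model from `path` here.
        self.model = None  # placeholder

    def __call__(self, data):
        inputs = data.get("inputs", "")
        # print(inputs)  # DON'T: this would write the raw payload into the
        #                # endpoint logs, where it is kept for 30 days.
        print("received a request")  # logging only metadata is safer
        return [{"generated_text": inputs}]  # placeholder echo response
```

So the "no payload storage" guarantee holds only as long as your own handler code doesn't print the payload into the logs.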

That makes sense, thanks. I'd probably just use the paid Inference Endpoint. I see that there's a range of prices, depending on instance specs. My needs are mid-range: inference on one of the new Open Assistant LLM models. I'm assuming I'll only be charged for active inference time, and that I don't need to pay to keep a compute instance live or in a "listening state" when I'm not sending requests? In other words: is usage and access on demand?

Ok. It seems my assumption was wrong. From what I can tell, using a paid Inference Endpoint involves paying to instantiate a compute instance and keep it running continuously. I was hoping for something more "on demand," but it makes sense, I suppose. I think the Inference API is a better fit for the early stages of my project.
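For anyone else landing here, calling the serverless Inference API looks roughly like this (a sketch, not official docs: `MODEL_ID` and the `hf_...` token are placeholders you'd fill in with your own values):

```python
import requests

# Placeholders: substitute your model repo id and your own HF access token.
API_URL = "https://api-inference.huggingface.co/models/MODEL_ID"
HEADERS = {"Authorization": "Bearer hf_..."}


def query(payload):
    """POST a JSON payload to the Inference API and return the parsed reply."""
    response = requests.post(API_URL, headers=HEADERS, json=payload)
    response.raise_for_status()  # surface rate-limit / auth errors early
    return response.json()


# Example payload for a text-generation model:
# query({"inputs": "Write a haiku about privacy."})
```

You pay nothing to keep anything "alive" here; you just send requests, subject to the API's rate limits.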

Phil (or someone else who knows), would you mind explaining a little more about how, and whether, payloads and tokens are stored and accessed on the Inference API? My goal is to interact with my chosen model about proprietary projects, so privacy and security are important. I can't find this information in the documentation. Also, what are the usage/rate limits?

Sorry for all the questions, and thanks in advance for your support!