Inference endpoint data privacy

Ok. It seems my assumption was wrong. From what I can tell, using a paid Inference Endpoint involves paying to instantiate a compute instance and keep it running continuously. I was hoping for something more “on demand,” but I suppose that makes sense. I think the Inference API is a better fit for the early stages of my project.

Phil (or someone else who knows), would you mind explaining a little more about whether, and how, payloads and tokens are stored and accessed on the Inference API? My goal is to use my chosen model on proprietary projects, so privacy and security are important to me. I can’t find this information in the documentation. Also, what are the usage/rate limits?
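For reference, my calls look roughly like this (a minimal sketch; the model ID, the prompt text, and the `HF_API_TOKEN` environment variable name are all placeholders). The payload and token here are exactly the pieces I’m asking about:

```python
import os
import requests

# Placeholder model ID -- substitute the actual model.
API_URL = "https://api-inference.huggingface.co/models/MODEL_ID"

# Token read from an environment variable rather than hard-coded.
headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}

# The request payload: does this get stored or logged server-side?
payload = {"inputs": "Example prompt describing a proprietary design..."}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())
```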

Sorry for all the questions, and thanks in advance for your support!
