Inference API offline model limit

The model is stored permanently in your local cache. I would suggest exporting it from the cache and hosting it on your own cloud instance instead.
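
As a minimal sketch of what that could look like, assuming you are using the Hugging Face `transformers` library (the model id and directory paths below are placeholders for illustration):

```python
from transformers import AutoModel, AutoTokenizer

model_id = "bert-base-uncased"  # placeholder; substitute your own model id

# First run downloads the weights into the local cache
# (~/.cache/huggingface by default).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# Export a self-contained copy that you can upload to your cloud instance.
tokenizer.save_pretrained("./my_model")
model.save_pretrained("./my_model")

# On the cloud instance, load fully offline from the exported directory:
# tokenizer = AutoTokenizer.from_pretrained("/path/on/server/my_model", local_files_only=True)
# model = AutoModel.from_pretrained("/path/on/server/my_model", local_files_only=True)
```

Once the exported directory is on your instance, loading with `local_files_only=True` avoids any further downloads or dependence on the Inference API.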