Hi everyone!
I managed to deploy a model and link it to my infrastructure. During testing, I ran inference on only 2 images. Each response took about 10-15 s, but I was billed for 1 minute 40 seconds or more (I had to pause the endpoint).
The documentation states exactly:
"Pay for compute resources uptime by the minute, billed monthly.
As low as $0.06 per CPU core/hr and $0.6 per GPU/hr."
which led me to assume that as long as the endpoint is not being used, I am not being billed. Am I wrong, or did I miss something?
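
To make the two readings concrete, here is a rough sketch of the arithmetic. It assumes a single GPU instance at the quoted $0.6/hr and that billing rounds up to whole minutes; both are my assumptions, not something the documentation excerpt confirms. The function name `billed_cost` is just for illustration.

```python
import math

# Rate taken from the quoted pricing line; a single GPU instance is an assumption.
GPU_RATE_PER_HOUR = 0.6


def billed_cost(seconds, rate_per_hour=GPU_RATE_PER_HOUR):
    """Cost for `seconds` of billable time.

    Assumption: usage is rounded up to whole minutes before pricing.
    """
    minutes = math.ceil(seconds / 60)
    return minutes * rate_per_hour / 60


# Reading 1: only the time spent answering requests is billed.
request_seconds = 2 * 15  # two requests, about 15 s each (upper bound)
cost_if_per_request = billed_cost(request_seconds)

# Reading 2: the whole time the endpoint is up is billed.
uptime_seconds = 100  # the 1 min 40 s that showed up on my bill
cost_if_uptime = billed_cost(uptime_seconds)

print(f"billed per request: ${cost_if_per_request:.2f}")
print(f"billed for uptime:  ${cost_if_uptime:.2f}")
```

Under these assumptions the amounts are tiny either way, but if the second reading is the correct one, an endpoint left running while idle would keep accruing cost until it is paused.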