Autoscaling is set to 0 min replicas, yet the endpoint is still costing money?

I have turned on scaling to zero after 15 minutes of inactivity, but the endpoint was still costing me money when there was no incoming traffic.

Hello, is there any way to raise this as a support request or defect? As per the HF documentation, we should not be charged if the Inference Endpoint has been idle for more than 15 minutes. Does this not apply to CPU- and GPU-based endpoints? Thanks

Hi @Kishal, Thanks for reaching out and sorry to hear about this issue. Could you please send us an email to: ? We’ll need the name of the endpoint in question and any details if possible. We’ll take a look to see what happened. Thanks again!
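
For anyone collecting the details requested above, here is a minimal sketch of how the endpoint's current state and scaling settings could be pulled together before emailing support. It assumes the `huggingface_hub` Python package is installed and that you are logged in with a token that can read the endpoint. The helper names `summarize_scaling` and `fetch_raw` are hypothetical, and the field names read from the raw payload (`compute.scaling.minReplica`, `maxReplica`, `scaleToZeroTimeout`, `status.state`) are my assumption about what the API returns, so check them against your own output.

```python
from typing import Any, Dict


def summarize_scaling(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Pick out the fields relevant to a scale-to-zero billing question.

    `raw` is assumed to be the raw endpoint payload; any missing field
    comes back as None rather than raising an error.
    """
    scaling = raw.get("compute", {}).get("scaling", {})
    return {
        "name": raw.get("name"),
        "state": raw.get("status", {}).get("state"),
        "min_replica": scaling.get("minReplica"),
        "max_replica": scaling.get("maxReplica"),
        "scale_to_zero_timeout_min": scaling.get("scaleToZeroTimeout"),
    }


def fetch_raw(endpoint_name: str) -> Dict[str, Any]:
    """Fetch the raw payload for one endpoint (needs network and a token)."""
    # Imported lazily so summarize_scaling works without the package installed.
    from huggingface_hub import get_inference_endpoint

    return get_inference_endpoint(endpoint_name).raw
```

Calling `summarize_scaling(fetch_raw("<your-endpoint-name>"))` while there is no traffic should show whether the minimum replica count is really 0 and whether the endpoint has actually reached a scaled-to-zero state. If the state still reports as running well after the timeout, that is a useful detail to include in the email.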