I am a PRO user and I have a Space running on ZeroGPU. It is pretty neat, but after a few inferences I get an error saying the quota is exceeded, along with some numbers like: Requested 42s and 60s…
3 questions please:
What is the quota and how is it calculated?
How do I interpret the numbers/seconds, i.e., “requested 42s on 60s”?
What is the path towards getting to a point with no quota limitations like this?
The quota on ZeroGPU for PRO users limits GPU usage to a specified time per session. ‘Requested 42s on 60s’ means your session requested 42 seconds of GPU time but was limited to 60 seconds.
Based on your answer, I am still not sure I understand what the quota is. Usually, when service providers talk about quota, they publish things like “60 requests per minute” or “3GB per hour”, etc. What is the quota of GPU time here?
And “Requested 42s on 60s” → if it means the session requested 42 seconds of GPU time and the limit was 60 seconds, doesn’t that mean there is enough? The way I understand it, there are 60 seconds available and my session requested 42 seconds?
Also, I noticed that when this error happens, it also says that I should try again in about 8 minutes, and at times even 35 or 40 minutes. So I am very confused as to how this is calculated.
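To make the question concrete, here is a minimal sketch of one quota model that would produce both symptoms: a request refused even though some time is left, and a "try again" wait that varies from one failure to the next. It is purely hypothetical and is not ZeroGPU's actual implementation. The class `RollingQuota`, the 162-second budget, and the one-hour window are all made-up values chosen for illustration. The assumption is that each call reserves a fixed number of GPU seconds up front, and that a reservation is only returned to the budget once it is older than the window.

```python
from dataclasses import dataclass, field


@dataclass
class RollingQuota:
    """Hypothetical rolling-window GPU time budget (NOT ZeroGPU's real code)."""

    budget_s: float  # GPU seconds allowed per window (assumed value)
    window_s: float  # window length in seconds (assumed value)
    usage: list = field(default_factory=list)  # (start_time, reserved_seconds)

    def remaining(self, now: float) -> float:
        # Forget reservations that have aged out of the window.
        self.usage = [(t, s) for t, s in self.usage if now - t < self.window_s]
        return self.budget_s - sum(s for _, s in self.usage)

    def request(self, now: float, requested_s: float):
        """Return (accepted, retry_after_seconds)."""
        left = self.remaining(now)
        if requested_s <= left:
            self.usage.append((now, requested_s))
            return True, 0.0
        # Rejected: find when enough old reservations will have expired.
        freed = left
        for t, s in sorted(self.usage):
            freed += s
            if freed >= requested_s:
                return False, t + self.window_s - now
        return False, float("inf")  # request is larger than the whole budget


quota = RollingQuota(budget_s=162, window_s=3600)
quota.request(0, 60)      # accepted, 102s left
quota.request(600, 60)    # accepted, 42s left
accepted, retry_after = quota.request(1200, 60)
print(accepted, retry_after / 60)  # rejected; wait is in minutes
```

Under these assumptions, the third call asks for 60 seconds while only 42 are left, so it is refused, and the wait depends on when the oldest reservation expires rather than on a fixed interval. That would explain retry times that differ between failures, but whether the real quota works this way is exactly what I am asking.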