Exceeded GPU quota

Do I need to ask my users to sign in with “their” HF account in the application?

Yes. That is probably the usage HF has in mind. I don’t know whether you could instead sign in with your own Pro account and raise the quota for all of your users, but that might be one way to do it. In any case, everything about the Pro account is vague. It isn’t just that I don’t understand it; as far as I can tell, the explanation simply doesn’t exist.

Also, how would the Pro plan be of any use if HF can’t even detect my activity?

If you are only using Spaces, it is of little use. I heard somewhere that the quota is 5 times higher when you are signed in with a Pro account, but the overwhelming majority of Spaces don’t even have a sign-in button…

If you create Spaces yourself, the biggest advantage is that you can run up to 10 ZeroGPU Spaces at the same time. If you can adapt that cleverly and call those Spaces from other cloud services, it could work out much cheaper.

In the Serverless Inference API, a Pro token increases the number of requests you can make and also unlocks more models (Llama3 70B, etc.).
However, the actual number of requests allowed has been ambiguous for a long time now.
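
For reference, here is a minimal sketch of how a token is attached to a Serverless Inference API request, using only the Python standard library. The URL pattern, the model ID, the `{"inputs": ...}` payload shape, and the helper name `build_inference_request` are my assumptions for illustration, not something taken from HF's documentation of the Pro plan. The function only builds the request and does not send it, so it uses no quota.

```python
import json
import urllib.request

# Assumption: the classic Serverless Inference API URL pattern.
API_BASE = "https://api-inference.huggingface.co/models"


def build_inference_request(model_id: str, token: str, prompt: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request authenticated with an HF token.

    Hypothetical helper: whichever account the token belongs to (free or Pro)
    is the account the request would be counted against.
    """
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{model_id}",
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # "hf_xxx" is a placeholder, not a real token.
    req = build_inference_request("meta-llama/Meta-Llama-3-70B-Instruct", "hf_xxx", "Hello")
    print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen(req)` is the only step that touches the network, and it counts against whatever request quota the token has.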

The same is true of the $20 Enterprise plan: the benefits of the flat-rate plans are just plain vague. It is unclear whether they need to be vague, or whether HF simply doesn’t realize it isn’t explaining them well enough. :sweat: