I’m a complete beginner (I just signed up for HF and am trying to learn how it can be used).
I’d like to use the Llama3.1-405b model via API requests, and I wonder whether HF allows people who already run their own servers to let other users access those servers through API requests. Is that possible?
If so, where on HF’s site can I find such servers?
It seems HF doesn’t allow the 405b model to be used through the serverless Inference API; apparently the model takes about 900GB, whereas the limit for serverless is 10GB.
Thank you.