Hi.
I want to buy an HF Pro plan to use llama-3.1-70b-instruct via the serverless API, but I'm wondering whether this model is served with quantization. I'd also like more information about the models available serverless.
One more question: can I use llama-3.1-405b-instruct (fp8) with the serverless API?