Hi.
I want to buy an HF Pro plan to use llama-3.1-70b-instruct via the serverless API, but I'm wondering whether this model is served with quantization. I'd also like more information about the models available serverless.
One more question: can I use llama-3.1-405b-instruct (fp8) with the serverless API?