Just signed up with HF and have some questions for the general community to help us get started:
We plan to use the Cerebras Inference Provider via direct calls rather than routing through HF itself.
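For context, by "direct calls" I mean hitting Cerebras's OpenAI-compatible endpoint ourselves instead of having HF route the request. A minimal sketch of what we have in mind (the endpoint URL, the `llama-3.3-70b` model id, and the `seed` parameter are assumptions based on Cerebras's public docs, not something I've verified on a paid plan):

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible chat endpoint for Cerebras Inference --
# double-check against the Cerebras docs for your account.
API_URL = "https://api.cerebras.ai/v1/chat/completions"

payload = {
    "model": "llama-3.3-70b",  # Cerebras's id for Llama-3.3-70B-Instruct (assumption)
    "messages": [{"role": "user", "content": "Say hello in one word."}],
    "temperature": 0,  # temperature 0 for more deterministic output
    "seed": 0,         # fixed seed, if the endpoint honors it
}

api_key = os.environ.get("CEREBRAS_API_KEY")
if api_key:  # only fire the request when a key is actually configured
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Setting `temperature` to 0 (and a fixed `seed`, if supported) is also the knob we'd expect to matter for the determinism question below.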
- Can we use the same API key obtained from Cerebras as the custom API key on HF?
- With a Pro subscription, are there any token-usage limits or queuing constraints when using a custom API key and direct calls? The Cerebras free tier did have such constraints.
- Has anyone here used the meta-llama/Llama-3.3-70B-Instruct model with Cerebras as the Inference Provider who can share feedback on performance, deterministic behavior, etc.?
Thanks in advance!