Cannot run large models using API token

Hi @mandelakori, Zephyr is an LLM developed by our team, so we've manually enabled inference for it. For other large models, we currently recommend using Inference Endpoints.
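
In case it helps, here is a minimal sketch of what calling a deployed endpoint with your token could look like. It uses only the Python standard library. The `ENDPOINT_URL` value is a hypothetical placeholder (use the URL shown for your own endpoint), the helper names `build_request` and `query` are made up for illustration, and the `{"inputs": ...}` payload shape is an assumption that fits typical text-generation deployments, so check what your model's task expects.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the URL of your own deployed endpoint.
ENDPOINT_URL = "https://your-endpoint.example.com"


def build_request(prompt: str, token: str, url: str = ENDPOINT_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the prompt and the API token."""
    # Assumed payload shape for a text-generation model.
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(prompt: str, token: str, url: str = ENDPOINT_URL):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(prompt, token, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Once the endpoint is running, something like `query("Hello", "<your token>")` should return the model's JSON response. Building the request is kept separate from sending it so you can inspect the headers and body before making a call.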