Cannot run large models using API token

I am also unable to run a large model on the Inference API, specifically Salesforce/codegen-16B-mono. Neither the widget on the website nor a REST request through Python works. In both cases I get a timeout; for example, the widget gives the following output after some time: "Model Salesforce/codegen-16B-mono time out."
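For reference, this is roughly how I am calling it from Python (the token and prompt are placeholders, and the endpoint URL is the standard Inference API pattern):

```python
import json
import urllib.request

# Placeholder token -- substitute your own Hugging Face API token.
API_TOKEN = "hf_xxx"
API_URL = "https://api-inference.huggingface.co/models/Salesforce/codegen-16B-mono"

def query(payload, timeout=300):
    """POST a JSON payload to the Inference API with a generous
    client-side timeout, since large models can take minutes to load."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    print(query({"inputs": "def hello_world():"}))
```

I have also tried adding `{"options": {"wait_for_model": True}}` to the payload, which as I understand it should make the API wait while the model loads rather than fail immediately, but the request still times out.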

Is that because the model is too big, or because something in the backend is broken for that model? In the latter case, should I ask the model's authors for help?