I have tried to deploy many of the models on the LLM leaderboard (Open LLM Leaderboard - a Hugging Face Space by HuggingFaceH4) to a Space as a Gradio app. The deployment appears to succeed, but none of the models I have tried return a response; they just time out.
I have a paid account, and am selecting an A10G for the hardware.
I am sure I am making a rookie mistake. Any help would be much appreciated.
Bump, because I’m having the exact same problem.
I’ve tried to deploy a 7B model (Ejafa/vicuna_7B_vanilla_1.1 · Hugging Face) onto T4 medium hardware, and it won’t produce a response to even simple prompts. It stays stuck on “processing” for minutes.
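One thing worth ruling out with a 7B model is whether the weights fit on the GPU at all. The following is only a back-of-the-envelope sketch: it counts weight memory alone (no activations or KV cache), takes the nominal card sizes of 16 GB for a T4 and 24 GB for an A10G at face value, and uses a hypothetical 80% headroom factor that is an assumption, not an HF rule.

```python
# Rough estimate of GPU memory needed just for model weights.
# Assumptions (not from the original posts): nominal GPU sizes treated
# as GiB, and a hypothetical 80% usable-memory headroom.

GIB = 1024 ** 3


def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in GiB."""
    return n_params * bytes_per_param / GIB


def fits(n_params: float, bytes_per_param: int, gpu_gib: float,
         headroom: float = 0.8) -> bool:
    """True if the weights fit within the assumed usable share of the GPU."""
    return weight_memory_gib(n_params, bytes_per_param) <= gpu_gib * headroom


if __name__ == "__main__":
    n_params = 7e9  # "7B" model
    for label, nbytes in (("fp32", 4), ("fp16", 2)):
        need = weight_memory_gib(n_params, nbytes)
        print(f"{label}: {need:.2f} GiB | "
              f"T4 (16): {fits(n_params, nbytes, 16)} | "
              f"A10G (24): {fits(n_params, nbytes, 24)}")
```

Under these assumptions, full-precision (4 bytes per parameter) weights for a 7B model do not fit on either card, and half precision is tight on a T4, so the dtype used at load time matters.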
The code I used to deploy is the one suggested by the HF interface:
import gradio as gr
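The rest of the snippet is not shown here. If it was the one-line loader that points Gradio at "models/<name>", that call goes through the hosted Inference API rather than running the model on the Space's own GPU, which could explain requests that never return for a large model. A minimal sketch of the alternative, loading the weights inside the Space with transformers, might look like the code below. It assumes gradio, torch, transformers and accelerate are listed in the Space's requirements.txt; the helper names, the fp16 choice and max_new_tokens=128 are illustrative, not something from the HF-suggested code.

```python
# Sketch: run the model on the Space's own GPU instead of a hosted API.
# MODEL_ID comes from the post above; everything else is an assumption.

MODEL_ID = "Ejafa/vicuna_7B_vanilla_1.1"


def strip_prompt(decoded: str, prompt: str) -> str:
    """Causal LMs echo the prompt; return only the newly generated text."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    return decoded.strip()


def build_demo(model_id: str = MODEL_ID):
    """Load the model once, then wrap generation in a Gradio interface."""
    # Heavy imports are kept inside the function so that importing this
    # module does not require a GPU stack.
    import gradio as gr
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to reduce memory use
        device_map="auto",          # requires the accelerate package
    )

    def generate(prompt: str) -> str:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        with torch.no_grad():
            output_ids = model.generate(**inputs, max_new_tokens=128)
        decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
        return strip_prompt(decoded, prompt)

    return gr.Interface(fn=generate, inputs="text", outputs="text")


# In the Space's app.py, the last line would then be:
# build_demo().launch()
```

The first start will still be slow while the weights download, so the Space's build and container logs are the place to check whether loading finished or ran out of memory.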
I gave up trying to deploy on HF in the end, and switched to Runpod. This tutorial was my starting point.