I have tried deploying many of the models from the LLM leaderboard (Open LLM Leaderboard - a Hugging Face Space by HuggingFaceH4) to a Space as a Gradio app. The deployment looks fine, but none of the models I have tried return a response; they just time out.
I have a paid account, and am selecting an A10G for the hardware.
I am sure I am making a rookie mistake, so any help would be much appreciated.
I’ve tried to deploy a 7B model (Ejafa/vicuna_7B_vanilla_1.1 · Hugging Face) onto T4 medium hardware, and it won’t produce a response to even simple prompts. It stays stuck on “processing” for minutes.
The code I used to deploy is the one suggested by the HF interface:
import gradio as gr
gr.Interface.load("models/Ejafa/vicuna_7B_vanilla_1.1").launch()
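For comparison, here is a minimal sketch of loading the model inside the Space itself, so that generation runs on the selected GPU. As far as I understand, the "models/" prefix in gr.Interface.load sends requests to the hosted Inference API rather than to the Space's own hardware. The helper names (build_prompt, load_generator, make_app) and the "USER: ... ASSISTANT:" prompt layout are my own assumptions, not something the HF interface suggested. It also assumes transformers, accelerate and torch are listed in requirements.txt, and that the fp16 weights (roughly 14 GB for 7B parameters) fit on the chosen GPU.

```python
MODEL_ID = "Ejafa/vicuna_7B_vanilla_1.1"


def build_prompt(message: str) -> str:
    """Wrap the user message in a simple chat-style prompt (assumed format)."""
    return f"USER: {message.strip()}\nASSISTANT:"


def load_generator(model_id: str = MODEL_ID):
    """Load the model onto the Space's own hardware instead of calling a hosted API."""
    # Heavy imports are kept local so the module can be imported without them.
    import torch
    from transformers import pipeline

    return pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.float16,  # half precision to reduce GPU memory use
        device_map="auto",          # requires the accelerate package
    )


def make_app(generator):
    """Build a text-in, text-out Gradio interface around the generator."""
    import gradio as gr

    def respond(message: str) -> str:
        outputs = generator(
            build_prompt(message),
            max_new_tokens=128,
            return_full_text=False,  # return only the newly generated text
        )
        return outputs[0]["generated_text"].strip()

    return gr.Interface(fn=respond, inputs="text", outputs="text")
```

In app.py on the Space, the last line would then be `make_app(load_generator()).launch()`. I have not confirmed that this resolves the timeouts; it only moves the model loading into the Space so that the hardware I selected is actually used.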