Hello,
I designed a backend for an LLM Leaderboard. The issue is that while it runs on the same Space with the CPU basic and Nvidia T4 medium hardware, it doesn't work on CPU Upgrade. After the model loads, it gives a connection error during the inference phase. I don't understand why it fails on a resource with more RAM and CPU when it works fine on a lower-tier one. It's a very frustrating situation. Can anyone help?
Here is the Space link: Backend - a Hugging Face Space by LLM-Beetle