Hello. I'm trying to get a RAG LLM running on HF Spaces. It runs, but I get this message:
- This is a demonstration of the document retrieval component of the RAG system.
- Due to model access limitations in Hugging Face Spaces, only the document retrieval part is working.
- In a full implementation, a language model would use these retrieved documents to generate a detailed answer.
I upgraded to Pro, but the message remained. Can anyone tell me which HF service I need to subscribe to in order to get past this message?
Thanks in advance
It looks like your RAG LLM is running on Hugging Face Spaces, but model access limitations are preventing the full implementation, even after upgrading to Pro. This likely means the required compute resources exceed what the Pro plan provides by default.
To enable the language model component, upgrading to a GPU-backed instance may be necessary. Hugging Face offers hardware tiers ranging from ZeroGPU (included with Pro) to higher-end options such as NVIDIA T4, L4, or A100 GPUs.
Here’s what can be done:
- Check the current Space configuration – Open Space settings and verify the hardware allocation.
- Upgrade to a GPU-backed instance – If the Space is running on CPU, upgrading to a GPU option via the Hugging Face pricing page might resolve the issue.
- Confirm model requirements – Some models need specific hardware configurations. If unsure, checking the model documentation or forums can clarify requirements.
If upgrading the hardware doesn’t solve the problem, sharing additional details like error messages or current settings might help pinpoint the issue. Let me know if any further guidance is needed!
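As a quick first check before changing any plan, you can have the Space itself report what hardware it actually sees. This is a minimal sketch (it only uses the Python standard library; the heuristic assumes GPU-backed Spaces ship the NVIDIA driver utility):

```python
import shutil


def gpu_available() -> bool:
    """Heuristic: `nvidia-smi` is present on GPU-backed Spaces, absent on CPU ones."""
    return shutil.which("nvidia-smi") is not None


def hardware_summary() -> str:
    """Short message you can print or log at Space startup."""
    if gpu_available():
        return "GPU-backed Space"
    return "CPU-only Space (upgrade hardware in the Space Settings if a GPU is needed)"
```

Logging `hardware_summary()` at startup makes it easy to tell whether an upgrade actually took effect after a restart.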
Hmm… I don't think that message comes from Hugging Face. Hugging Face may be raising an error too, but the text you quoted reads like a fallback message built into the app itself, shown when the language model part fails to load.
The library is probably not working as expected.
How about trying another service for the LLM part of the RAG pipeline?
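One way to do that is to keep the retrieval code as-is and isolate the generation step behind a single function, so it can call a hosted endpoint instead of a locally loaded model. A minimal sketch (assumptions: the retriever returns plain-text chunks, the model name is illustrative, and a valid HF token is configured in the Space secrets):

```python
def build_prompt(question: str, docs: list[str]) -> str:
    """Combine the retrieved documents and the user question into one prompt."""
    context = "\n\n".join(docs)
    return (
        "Answer the question using only this context:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def generate_answer(question: str, docs: list[str]) -> str:
    """Generation step delegated to a hosted LLM instead of a local model."""
    # Imported lazily so the retrieval-only part of the app still runs
    # even if this dependency or the remote service is unavailable.
    from huggingface_hub import InferenceClient

    client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.3")  # illustrative model
    return client.text_generation(build_prompt(question, docs), max_new_tokens=256)
```

Because the LLM call is behind `generate_answer`, swapping in a different provider later only means changing that one function.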