Hello. I'm trying to get a RAG LLM running on HF Spaces. It runs, but I get this message:
- This is a demonstration of the document retrieval component of the RAG system.
- Due to model access limitations in Hugging Face Spaces, only the document retrieval part is working.
- In a full implementation, a language model would use these retrieved documents to generate a detailed answer.
I upgraded to Pro, but the message remained. Can anyone tell me which HF service I need to subscribe to in order to get past this message?
Thanks in advance
It looks like your RAG LLM is running on Hugging Face Spaces, but model access limitations are preventing the full implementation, even after upgrading to Pro. This likely means the required compute resources exceed what the Pro plan provides by default.
To enable the language model component, upgrading to a GPU-backed instance may be necessary. Hugging Face offers hardware tiers ranging from ZeroGPU (included with Pro) to higher-end options such as NVIDIA T4, L4, or A100 GPUs.
Here’s what can be done:
- Check the current Space configuration – Open Space settings and verify the hardware allocation.
- Upgrade to a GPU-backed instance – If the Space is running on CPU, upgrading to a GPU option via the Hugging Face pricing page might resolve the issue.
- Confirm model requirements – Some models need specific hardware configurations. If unsure, checking the model documentation or forums can clarify requirements.
If upgrading the hardware doesn’t solve the problem, sharing additional details like error messages or current settings might help pinpoint the issue. Let me know if any further guidance is needed!
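As a quick first check before changing any plan, you can have the Space itself report what hardware it actually sees. This is a minimal sketch (it only uses the Python standard library; the heuristic assumes GPU-backed Spaces ship the NVIDIA driver utility):

```python
import shutil


def gpu_available() -> bool:
    """Heuristic: `nvidia-smi` is present on GPU-backed Spaces, absent on CPU ones."""
    return shutil.which("nvidia-smi") is not None


def hardware_summary() -> str:
    """Short message you can print or log at Space startup."""
    if gpu_available():
        return "GPU-backed Space"
    return "CPU-only Space (upgrade hardware in the Space Settings if a GPU is needed)"
```

Logging `hardware_summary()` at startup makes it easy to tell whether an upgrade actually took effect after a restart.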
Hmm… I don't think that message comes from Hugging Face. Hugging Face may be raising an error too, but the text you quoted reads like a fallback message built into the app itself, shown when the language model part fails to load.
The library is probably not working as expected.
How about trying another service for the LLM part of the RAG pipeline?
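One way to do that is to keep the retrieval code as-is and isolate the generation step behind a single function, so it can call a hosted endpoint instead of a locally loaded model. A minimal sketch (assumptions: the retriever returns plain-text chunks, the model name is illustrative, and a valid HF token is configured in the Space secrets):

```python
def build_prompt(question: str, docs: list[str]) -> str:
    """Combine the retrieved documents and the user question into one prompt."""
    context = "\n\n".join(docs)
    return (
        "Answer the question using only this context:\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def generate_answer(question: str, docs: list[str]) -> str:
    """Generation step delegated to a hosted LLM instead of a local model."""
    # Imported lazily so the retrieval-only part of the app still runs
    # even if this dependency or the remote service is unavailable.
    from huggingface_hub import InferenceClient

    client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.3")  # illustrative model
    return client.text_generation(build_prompt(question, docs), max_new_tokens=256)
```

Because the LLM call is behind `generate_answer`, swapping in a different provider later only means changing that one function.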