Which Is the best small model (3b) for rag, I am building a rag and using mistral-nemo 12b for it, i have tested many other model but not getting expected output like mistral nemo providing, nemo exactly follow system prompt but i can’t find any 3b model which exactly follow my system prompt, its normal that nemo is 12b model so it works better than any 3b model, but in my case i don’t want my model to have a large knowledge base outside my domain (200 pdf’s), and i want it to be super fast …
i am currently using Ollama ,please suggest the best 2-4b model for rag , smaller is better
1 Like
If you want a model that is as versatile as possible in that size range, I recommend these models.
Among the 7 or 8B models, Ministral instruct 2410 GGUF is the best for me in french (IQ4 XS is small), so it’s probably also the best among the 3Bs.
For local PDF GPT4all is interesting, LocalDocs is efficient.
1 Like
I’m using granite3.2:8b for rag.granite3.2:2b is good as 2B.
but I’m not sure if the model can understand the system prompt provided by you.
1 Like
Gemma 3 has been released, and 4B and 1B are in the lineup.
This seems to fit this use case.