I’m currently looking for a very small LLM that cannot exceed 1.5GB in size. The goal is for it to handle simple Q&A tasks (nothing fact-based or overly complex, just basic interpretation of input). Additionally, the model needs to understand basic writing mistakes (typos, grammar issues) and be able to handle very primitive interactions.
I understand that larger models generally perform better, but I’m really constrained by size limits here. Does anyone have experience working with models of this size or know how to achieve something like this while retaining minimal functionality?
Any advice or guidance would be greatly appreciated!
If you are looking for a model that is about 1 GB in its quantized form, you can simply grab a 1B-parameter GGUF and you are done; the number of models that fit within 1 GB in float16, however, is quite limited.
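For the quantized route, here is a minimal sketch of running such a model locally with llama-cpp-python. It assumes `pip install llama-cpp-python` and a GGUF file you have already downloaded; the file name is a placeholder, not a specific recommendation, and any ~4-bit 1B GGUF should fit well under your 1.5 GB limit.

```python
# Minimal sketch: simple Q&A with a ~1 GB quantized GGUF model.
# Assumes llama-cpp-python is installed and the model file exists locally;
# the path below is a hypothetical placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./model-1b-instruct-q4_k_m.gguf",  # placeholder file name
    n_ctx=2048,       # a small context window is enough for primitive Q&A
    verbose=False,
)

# Input with deliberate typos, the kind of "basic writing mistakes"
# the question asks the model to tolerate.
prompt = "Q: My frend said the meeting is tomorow. When is the meeting?\nA:"
out = llm(prompt, max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"].strip())
```

At this scale, keeping prompts short and constrained (explicit Q:/A: framing, a stop sequence) tends to matter more than the exact model choice.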
The following model meets those requirements, and its author has a lot to say about small models.
You can find a 1B GGUF model at the link below.
Qwen 2.5 performs very well in general; its multilingual performance in particular is much better than Qwen 2's. Other 0.5B models can also be searched for as shown below.
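Since the original search URL isn't reproduced here (and the forum mangles such links anyway, see below), one programmatic alternative is the huggingface_hub client. A minimal sketch follows; note the `"0.5b"` search string is just a name-matching heuristic and will miss models that don't put the parameter count in their repo name.

```python
# Minimal sketch: listing small models on the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; no token is needed for public search.
from huggingface_hub import HfApi

api = HfApi()
# Search repo names for "0.5b", most-downloaded first, top 10 results.
for model in api.list_models(search="0.5b", sort="downloads", direction=-1, limit=10):
    print(model.id)
```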
By the way, the forum's automatic conversion of model-search links is slightly buggy.
One of the URLs below works fine, but only because I fixed it manually; the other one does not work. I don't understand the logic.