Hi All,
I have a server with 250GB of RAM but no GPU. I’ve attempted to run some quantized Llama 3 models, such as:
- unsloth: Llama-3.3-70B-Instruct-Q5_K_M.gguf, Llama-3.3-70B-Instruct-Q3_K_M.gguf
However, I’ve been unable to load them due to RAM limitations.
I’m looking for an LLM that can run within my 250GB RAM setup. My tasks involve basic question-answering, such as analyzing a call transcript (in JSON format) between an agent and a customer to determine:
- Whether the agent introduced themselves,
- Whether the agent resolved the issue, or
- Whether the issue was escalated to a supervisor.
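For context, here is a minimal sketch of the kind of prompt I plan to build from a transcript. The `speaker`/`text` field names and the sample turns are just placeholders for how my JSON roughly looks, not the exact schema:

```python
import json

# Hypothetical transcript shape -- my real JSON uses similar speaker/text turns.
transcript_json = json.dumps([
    {"speaker": "agent", "text": "Hi, this is Dana from support. How can I help?"},
    {"speaker": "customer", "text": "My internet has been down since yesterday."},
    {"speaker": "agent", "text": "I have reset your line; it should be back up now."},
])

QUESTIONS = [
    "Did the agent introduce themselves?",
    "Did the agent resolve the issue?",
    "Was the issue escalated to a supervisor?",
]

def build_prompt(transcript_json: str) -> str:
    """Flatten the JSON turns into plain text and append the yes/no questions."""
    turns = json.loads(transcript_json)
    lines = [f"{t['speaker']}: {t['text']}" for t in turns]
    questions = "\n".join(f"- {q}" for q in QUESTIONS)
    return (
        "Call transcript:\n" + "\n".join(lines)
        + "\n\nAnswer each question with yes or no:\n" + questions
    )

print(build_prompt(transcript_json))
```

So the model only needs to read a short flattened transcript and answer three yes/no questions, nothing more elaborate than that.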
Could you please suggest any suitable models for these requirements?
Thanks in advance!