notebook link.
RuntimeError: unable to mmap 9976576152 bytes from file </home/devuser/.cache/huggingface/hub/models--meta-llama--Llama-2-7b-chat-hf/snapshots/c1b0db933684edbfe29a06fa47eb19cc48025e93/model-00001-of-00002.safetensors>: Cannot allocate memory (12)
Were you able to solve it? I am facing the same issue.
Yep, I performed the adapter merge in a separate script, and then I loaded the merged model with `low_cpu_mem_usage=True`.
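For reference, a minimal sketch of that two-step approach (the adapter and output directories are placeholders, not from the thread):

```python
BASE_MODEL = "meta-llama/Llama-2-7b-chat-hf"
ADAPTER_DIR = "./my-lora-adapter"    # hypothetical PEFT adapter directory
MERGED_DIR = "./llama-2-7b-merged"   # hypothetical output directory

def merge_adapter_and_save():
    """Step 1 (run as a separate script): merge the LoRA adapter into the base model."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL,
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,  # stream weights in instead of materializing twice
    )
    merged = PeftModel.from_pretrained(base, ADAPTER_DIR).merge_and_unload()
    merged.save_pretrained(MERGED_DIR)

def load_merged_model():
    """Step 2: load the merged weights with low_cpu_mem_usage to avoid the mmap failure."""
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        MERGED_DIR,
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
    )
```

Doing the merge in its own process matters because it keeps the full-precision base model and the adapter from having to coexist with your inference setup in one Python process.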
Thanks!
But wouldn’t this keep the entire model on the CPU, even if you have a GPU?
I don’t think so. `low_cpu_mem_usage` only changes how the weights are loaded; where they end up is controlled by the `device_map` argument.
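A hedged sketch of that combination (requires `accelerate`; the model id is the one from the traceback above):

```python
def load_on_gpu(model_id="meta-llama/Llama-2-7b-chat-hf"):
    """Load with low CPU memory usage while placing weights on the GPU.

    device_map="auto" lets accelerate fill GPU memory first and only
    offloads the remaining layers to CPU (or disk) if VRAM runs out,
    so the model is not forced onto the CPU.
    """
    import torch
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        device_map="auto",       # or an explicit map, e.g. {"": 0} for GPU 0
        low_cpu_mem_usage=True,  # implied when device_map is set, shown for clarity
    )
```

You can also pass an explicit dict mapping module names to devices if you want to pin specific layers to CPU yourself.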