When I run the command `python convert_llama_weights_to_hf.py --input_dir /home/hadoop-kg-llm-ddpt/llama/ --model_size 70B --output_dir /home/hadoop-kg-llm-ddpt/llama/Llama-2-70b-chat-hf` to convert Llama 2, an error occurred:
Also, I noticed this note in the script:

"Important note: you need to be able to host the whole model in RAM to execute this script (even if the biggest versions come in several checkpoints, they each contain a part of each weight of the model, so we need to load them all in RAM)."
However, I still want to ask: when there is not enough RAM, is there any parameter that lets the script use the disk instead of memory to complete the conversion?