Issues when trying to build llama.cpp

Hi,

Thanks for the reply.

I’m not sure I can control how llama.cpp is built, since I’m running a Python script for the training; after training, I save the models and push them to HF:

.....
perform_training()

model.save_pretrained("./llama-2-7b-chat_fine_tuned")
tokenizer.save_pretrained("./llama-2-7b-chat_fine_tuned")

model.push_to_hub("jonACE/llama-2-7b-chat_fine_tuned", token=hf_token)
tokenizer.push_to_hub("jonACE/llama-2-7b-chat_fine_tuned", token=hf_token)


# save GGUF versions
model.save_pretrained_gguf("./llama-2-7b-chat_fine_tuned", tokenizer)
model.push_to_hub_gguf("jonACE/llama-2-7b-chat_fine_tuned", tokenizer, token=hf_token)

model.save_pretrained_gguf("./llama-2-7b-chat_fine_tuned", tokenizer, quantization_method = "f16")
model.push_to_hub_gguf("jonACE/llama-2-7b-chat_fine_tuned", tokenizer, quantization_method = "f16", token=hf_token)

model.save_pretrained_gguf("./llama-2-7b-chat_fine_tuned", tokenizer, quantization_method = "q4_k_m")
model.push_to_hub_gguf("jonACE/llama-2-7b-chat_fine_tuned", tokenizer, quantization_method = "q4_k_m", token=hf_token)
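As a possible workaround, if you need to pin the llama.cpp version yourself, you could skip the built-in GGUF export and run the conversion manually against a llama.cpp checkout you control. This is only a sketch: it assumes merged fp16 weights were saved (e.g. via unsloth's save_pretrained_merged rather than the adapter-only save_pretrained) and that llama.cpp is cloned at ./llama.cpp with its stock convert_hf_to_gguf.py script:

# Sketch only: convert a saved, merged model with a llama.cpp checkout you control.
# Assumes ./llama.cpp is a recent clone that ships convert_hf_to_gguf.py.
import subprocess

subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        "./llama-2-7b-chat_fine_tuned",
        "--outfile", "./llama-2-7b-chat_fine_tuned/model-f16.gguf",
        "--outtype", "f16",
    ],
    check=True,
)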

The model object was created with unsloth.FastLanguageModel:

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=2048
)

model = FastLanguageModel.get_peft_model(model)

Could it be that the Python ‘unsloth’ package is still based on an older llama.cpp version?
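As a quick check (a generic sketch, not unsloth-specific), you can print the installed unsloth version and compare it against the latest release before assuming its bundled llama.cpp is stale:

# Print the installed version of any package, here unsloth.
from importlib.metadata import version
print(version("unsloth"))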
