I fine-tuned LLaMA 2 to test its query classification quality. After saving the final model, I converted the .bin files to safetensors and served the model with TGI. However, I noticed that I get completely different results compared to calling model.generate() directly. Note that all other sampling parameters (top_p, top_k, etc.) are identical, and temperature is set to a small positive value (0.01). After a lot of testing, I am confident that the safetensors conversion is the only variable between the two setups.
Is this a known issue or a bug in the conversion? I used the following code to convert: