The introduction to Llama says that Llama 2 was trained in bfloat16, yet the model weights uploaded to Hugging Face are in float16. I'm wondering why current work based on Llama 2 loads it in float16 rather than bfloat16.
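For context, here is a minimal sketch of how the dtype can be set explicitly when loading, assuming the standard `transformers` `from_pretrained` API and the `meta-llama/Llama-2-7b-hf` checkpoint (substitute whichever repo you actually use):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint; any Llama 2 repo id works the same way.
model_id = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Request bfloat16 explicitly instead of inheriting the float16
# dtype the weights were uploaded in.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)

print(model.dtype)  # torch.bfloat16
```

If `torch_dtype` is omitted, the weights load in whatever dtype they were saved in (float16 here), which seems to be why most downstream code ends up in float16.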