The Llama 2 introduction states that the model was trained in bfloat16, yet the weights uploaded to Hugging Face are stored in float16. I'm wondering why current work based on Llama 2 loads it in float16 rather than bfloat16.
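For context, here is a minimal sketch of the two loading options I mean, assuming the `meta-llama/Llama-2-7b-hf` checkpoint and the standard `transformers` `torch_dtype` argument to override the dtype at load time:

```python
import torch
from transformers import AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"  # assuming the 7B base checkpoint

# What most current work appears to do: load the uploaded weights as float16.
model_fp16 = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
)

# The alternative I'm asking about: cast the weights to bfloat16,
# the dtype Llama 2 was reportedly trained in.
model_bf16 = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)
```

Is there a reason to prefer the first over the second, given the original training precision?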