When calling model.predict() on a Hugging Face Llama model, I get a loss of 1.16, which is greater than 1. If the language-model loss is cross entropy, how can it be greater than 1?
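For context, here is a minimal sanity check (a sketch, assuming the loss is PyTorch's standard cross entropy, which uses the natural log). Since the loss is the negative natural log of the probability assigned to the correct token, it is bounded above by ln(vocab_size), not by 1:

```python
import math
import torch
import torch.nn.functional as F

# Cross entropy in nats: -ln(p_target). A uniform prediction over a
# Llama-sized vocabulary (32000 tokens) gives the worst "uninformed" loss.
logits = torch.zeros(1, 32000)  # uniform distribution over the vocab
target = torch.tensor([0])
loss = F.cross_entropy(logits, target)
print(loss.item())       # ln(32000) ≈ 10.37, far above 1

# A loss of 1.16 corresponds to an average probability of
# exp(-1.16) ≈ 0.31 on the correct token.
print(math.exp(-1.16))   # ≈ 0.31
```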