When calling model.predict() on a Hugging Face Llama model, I get a loss of 1.16, which is greater than 1. If the language model's loss is cross-entropy, how can it be greater than 1?
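For context, here is a minimal sketch of my understanding, assuming the reported loss is the mean per-token negative log-likelihood with the natural log (i.e. loss = -log p of the correct token). Under that assumption, the loss would exceed 1 whenever the model assigns the correct token a probability below 1/e:

```python
import math

# Per-token cross-entropy is -log(p), where p is the probability the
# model assigns to the correct next token. It is bounded below by 0,
# but has no upper bound of 1: any p below 1/e (~0.368) gives loss > 1.
for p in [0.9, 0.5, 1 / math.e, 0.1]:
    print(f"p = {p:.3f}  ->  loss = {-math.log(p):.3f}")

# p = 0.900  ->  loss = 0.105
# p = 0.500  ->  loss = 0.693
# p = 0.368  ->  loss = 1.000
# p = 0.100  ->  loss = 2.303
```

Is that the right way to read the 1.16 value, or is the loss computed differently here?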