Currently I’m trying to get the LM Evaluation Harness running, without success. Is there an easy way to benchmark or evaluate pre-trained generative text models inside the Hugging Face library? I’m sorry if this is really obvious.
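For reference, this is roughly the kind of thing I’m after: a minimal sketch using the Hugging Face `evaluate` library’s perplexity metric (`gpt2` here is just a placeholder model, and the input texts are made up):

```python
# Minimal sketch: scoring a pretrained causal LM with the Hugging Face
# `evaluate` library's perplexity metric. `gpt2` is only an example model id.
import evaluate

perplexity = evaluate.load("perplexity", module_type="metric")

input_texts = [
    "The quick brown fox jumps over the lazy dog.",
    "Hugging Face provides tools for training and evaluating models.",
]

# Loads the model from the Hub and computes per-sample and mean perplexity.
results = perplexity.compute(model_id="gpt2", predictions=input_texts)
print(results["mean_perplexity"])
```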