Just go here and see the runtime errors: evaluate-metric (Evaluate Metric)
How can this not get fixed? Hugging Face is such a great company; this is a huge oversight. The library is completely unusable. Even “accuracy” fails.
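For context, this is roughly the kind of call that breaks for me — a minimal sketch using the documented `evaluate.load()` / `compute()` pattern (the prediction and reference values are just made-up examples):

```python
import evaluate

# Load the "accuracy" metric script from the Hub -- this is the step
# that surfaces the runtime errors for me.
accuracy = evaluate.load("accuracy")

# Illustrative inputs only: 3 of 4 predictions match the references.
result = accuracy.compute(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])
print(result)  # expected: {'accuracy': 0.75}
```

Nothing exotic — it is the exact usage the docs show.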
I wish my sklearn metrics had report cards like these do, but the library is so unreliable I can’t use it. The datasets package documentation says its metrics are deprecated, but the recommended alternative hasn’t worked for some time now…