Hi, I got a bit confused when using the Evaluator. For example, I'm using the "smsa" subset of the indonlu dataset, whose labels are {"positive": 0, "neutral": 1, "negative": 2},
but when I pass that as label_mapping to the evaluator, I get this error:
KeyError Traceback (most recent call last)
Cell In [5], line 1
----> 1 eval_results = task_evaluator.compute(
2 model_or_pipeline="mdhugol/indonesia-bert-sentiment-classification",
3 data=data,
4 label_mapping={"positive": 0, "neutral": 1,"negative": 2}
5 )
File c:\users\w i n d o w s\pycharmprojects\pythonproject\venv\lib\site-packages\evaluate\evaluator\text_classification.py:105, in TextClassificationEvaluator.compute(self, *args, **kwargs)
41 def compute(self, *args, **kwargs) -> Tuple[Dict[str, float], Any]:
42 """
43 Compute the metric for a given pipeline and dataset combination.
44 Args:
(...)
102 >>> )
103 ```"""
--> 105 result = super().compute(*args, **kwargs)
107 return result
File c:\users\w i n d o w s\pycharmprojects\pythonproject\venv\lib\site-packages\evaluate\evaluator\base.py:200, in Evaluator.compute(self, model_or_pipeline, data, metric, tokenizer, feature_extractor, strategy, confidence_level, n_resamples, device, random_state, input_column, label_column, label_mapping)
198 # Compute predictions
199 predictions, perf_results = self.call_pipeline(pipe, pipe_inputs)
--> 200 predictions = self.predictions_processor(predictions, label_mapping)
202 metric_inputs.update(predictions)
204 # Compute metrics from references and predictions
File c:\users\w i n d o w s\pycharmprojects\pythonproject\venv\lib\site-packages\evaluate\evaluator\text_classification.py:35, in TextClassificationEvaluator.predictions_processor(self, predictions, label_mapping)
34 def predictions_processor(self, predictions, label_mapping):
---> 35 predictions = [
36 label_mapping[element["label"]] if label_mapping is not None else element["label"]
37 for element in predictions
38 ]
39 return {"predictions": predictions}
File c:\users\w i n d o w s\pycharmprojects\pythonproject\venv\lib\site-packages\evaluate\evaluator\text_classification.py:36, in <listcomp>(.0)
34 def predictions_processor(self, predictions, label_mapping):
35 predictions = [
---> 36 label_mapping[element["label"]] if label_mapping is not None else element["label"]
37 for element in predictions
38 ]
39 return {"predictions": predictions}
KeyError: 'LABEL_1'
But when I change the label mapping to follow what the text classification docs say, it still doesn't work. Why can't the model take the label mapping from the dataset? This is the code I'm running:
from datasets import load_dataset
from evaluate import evaluator
from transformers import AutoModelForSequenceClassification, pipeline

# Evaluate on a fixed random sample of 500 test examples
data = load_dataset("indonlu", "smsa", split="test").shuffle(seed=42).select(range(500))
task_evaluator = evaluator("text-classification")
eval_results = task_evaluator.compute(
    model_or_pipeline="mdhugol/indonesia-bert-sentiment-classification",
    data=data,
    # this also failed
    label_mapping={'LABEL_0': 'positive', 'LABEL_1': 'neutral', 'LABEL_2': 'negative'},
    # this also didn't work
    # label_mapping={"NEGATIVE": 0, "POSITIVE": 1},
    # this throws an error even though it is the same as the dataset labels
    # label_mapping={"positive": 0, "neutral": 1, "negative": 2},
)
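Reading the traceback again, predictions_processor does label_mapping[element["label"]], so the keys of label_mapping apparently have to be the label strings the pipeline emits ('LABEL_1' in the KeyError), not the dataset's label names. Below is a minimal sketch that imitates that lookup outside the library. The function name process_predictions and the sample predictions are made up for illustration, and the second mapping assumes LABEL_0/LABEL_1/LABEL_2 correspond to positive/neutral/negative for this model, which I have not verified.

```python
# Standalone imitation of the lookup shown in the traceback; no library needed.
def process_predictions(predictions, label_mapping=None):
    """Hypothetical helper mirroring the list comprehension in the traceback."""
    return [
        label_mapping[element["label"]] if label_mapping is not None else element["label"]
        for element in predictions
    ]

# Made-up pipeline output, using the label strings seen in the KeyError.
sample_predictions = [
    {"label": "LABEL_1", "score": 0.91},
    {"label": "LABEL_0", "score": 0.85},
]

# Keyed by the dataset's label names: the lookup fails as in the traceback.
dataset_style_mapping = {"positive": 0, "neutral": 1, "negative": 2}
try:
    process_predictions(sample_predictions, dataset_style_mapping)
    failed_key = None
except KeyError as err:
    failed_key = err.args[0]

# Keyed by the pipeline's label strings, with the dataset's integer ids as values.
# Assumption: LABEL_0/1/2 mean positive/neutral/negative for this model.
pipeline_style_mapping = {"LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2}
mapped = process_predictions(sample_predictions, pipeline_style_mapping)

print(failed_key, mapped)
```

If that reading is right, a mapping such as {"LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2} might be what compute expects here, since the values would then be integers like the dataset's label column.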