I have a dataset consisting of a train, val and test set. I want to convert the values in the label column from the existing positive, neutral and negative to 0, 1 and 2. This is easily done with pandas, but I can't figure out how to do it in the Hugging Face datasets framework. Help?
Take a look at the datasets.Dataset.map() method.
I guess something like this should achieve what you want:
def map_labels(sample):
    label = sample["label"]
    sample["label"] = 0 if label == "positive" else 1 if label == "neutral" else 2
    return sample

result = dataset.map(map_labels)
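A slightly more defensive variant of the same idea, as a rough sketch: it uses a lookup table instead of chained conditionals, so an unexpected label raises a KeyError rather than silently becoming 2. The name LABEL2ID and the toy rows are made up for illustration; plain dicts stand in for dataset rows so the function can be tried without loading anything.

```python
# Explicit lookup table; the ids follow the order asked for in the question.
# LABEL2ID is an illustrative name, not something from the datasets library.
LABEL2ID = {"positive": 0, "neutral": 1, "negative": 2}


def map_labels(sample):
    # A KeyError here flags a label outside the three expected values,
    # instead of quietly mapping it to 2.
    sample["label"] = LABEL2ID[sample["label"]]
    return sample


# Plain dicts stand in for dataset rows (toy data, not from the real dataset).
rows = [
    {"text": "great", "label": "positive"},
    {"text": "fine", "label": "neutral"},
    {"text": "awful", "label": "negative"},
]

# Copy each row so the toy input is left unchanged.
converted = [map_labels(dict(row)) for row in rows]
print([row["label"] for row in converted])  # [0, 1, 2]
```

The same function can be passed to dataset.map(map_labels). If dataset is a DatasetDict holding the train, val and test splits, map is applied to each split.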
Hi! The easiest/fastest way is to directly cast the label column to the ClassLabel type:
import datasets
dset = dset.cast_column("label", datasets.ClassLabel(names=["positive", "neutral", "negative"]))