How to create custom ClassLabels?

I would like to turn a column in my dataset into ClassLabels.
For my use case, I have a column with three values and would like to map these to the class labels.
Creating the labels and setting the column is fairly straightforward:

from datasets import ClassLabel

# "basic_sentiment" holds the values -1, 0 and 1
feat_sentiment = ClassLabel(num_classes=3, names=["negative", "neutral", "positive"])
dataset = dataset.cast_column("basic_sentiment", feat_sentiment)

Now ClassLabel has three labels: 0 - negative, 1 - neutral, 2 - positive, while the data still has the values -1 to 1.
How can I make the ClassLabel use the values in the column? Or do I have to change my column to fit the labels of the ClassLabel?
I couldn't find an in-depth explanation of how to use this feature in Hugging Face datasets.

Hi! Use map to map the column to the new range and specify features to perform the cast:

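from datasets import ClassLabel

# Copy the existing schema and swap in the ClassLabel for the target column
features = dataset.features.copy()
features["basic_sentiment"] = ClassLabel(names=["negative", "neutral", "positive"])
def adjust_labels(batch):
    # Shift -1, 0, 1 to 0, 1, 2 so the values match the ClassLabel indices
    batch["basic_sentiment"] = [sentiment + 1 for sentiment in batch["basic_sentiment"]]
    return batch
dataset = dataset.map(adjust_labels, batched=True, features=features)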

Thanks for your reply!
So it is not possible to change the internal labeling of the class labels then?
→ I have to change my column to fit the ClassLabel ordering?

Yes, currently only the [0, ..., num_classes - 1] range is supported when casting to ClassLabel.
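
If the raw values are not a simple offset from that range, an explicit lookup table may be easier than arithmetic. Below is a minimal sketch in plain Python, assuming the column only ever contains the values -1, 0 and 1. The names RAW_VALUES, NAMES, RAW_TO_ID and remap_labels are made up for illustration and are not part of the datasets library.

```python
# Hypothetical names for illustration; only the standard library is used here.
RAW_VALUES = [-1, 0, 1]                      # values stored in the column
NAMES = ["negative", "neutral", "positive"]  # label names, in class-index order

# Map each raw value to its position, i.e. an index in [0, num_classes - 1].
RAW_TO_ID = {raw: idx for idx, raw in enumerate(RAW_VALUES)}


def remap_labels(batch):
    """Replace raw sentiment values in a batch with class indices."""
    batch["basic_sentiment"] = [RAW_TO_ID[value] for value in batch["basic_sentiment"]]
    return batch


# Try it on a plain dict shaped like a batch of column values.
example = remap_labels({"basic_sentiment": [-1, 1, 0, 1]})
ids = example["basic_sentiment"]
labels = [NAMES[i] for i in ids]
print(ids, labels)
```

Since remap_labels has the same shape as adjust_labels in the reply above, it should work the same way when passed to dataset.map with batched=True and the updated features. A value missing from RAW_TO_ID raises a KeyError, which surfaces unexpected data instead of silently mislabeling it.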
