How to set useful id2label and label2id in config.json using Trainer

TL;DR - when training, a config.json is created and I would like to know how to define meaningful label names in it, instead of the default “LABEL_0”.

I’ve defined a dataset loader, which includes a label feature. I’m calling functions such as:

model = ResNetForImageClassification.from_pretrained(...)
train_dataset = load_dataset('custom/dataset/path', split='test')
trainer = CustomTrainer(model=model, ...)
trainer.train()

When it saves checkpoints, it creates a config.json with default id2label entries, such as “0”: “LABEL_0”. How can I pass the real labels so that they end up in id2label and label2id in config.json?

I am facing the same issue. Please let me know if you find a solution.

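# the config expects integer ids: int keys in id2label, int values in label2id
from transformers import AutoConfig, AutoModelForSequenceClassification

# define mappings as dictionaries
id2label = {0: "not toxic", 1: "toxic"}
label2id = {"not toxic": 0, "toxic": 1}

# define config; model_name is the checkpoint you are fine-tuning
config = AutoConfig.from_pretrained(model_name, label2id=label2id, id2label=id2label)
model = AutoModelForSequenceClassification.from_pretrained(model_name, config=config)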
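
For the image classification case in the original question, the two mappings can also be built from a list of class names rather than typed by hand. This is only a minimal sketch: `build_label_maps` is a hypothetical helper (not part of transformers), and the class names are made up. If the dataset's label column is a `datasets` ClassLabel feature named `label` (an assumption about the custom loader), the names would likely come from `train_dataset.features["label"].names`.

```python
def build_label_maps(class_names):
    """Return (id2label, label2id) with integer ids, in list order."""
    id2label = {i: name for i, name in enumerate(class_names)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


# Hypothetical class names, used here only for illustration. With a
# ClassLabel feature they would be read from the dataset instead.
class_names = ["cat", "dog", "bird"]
id2label, label2id = build_label_maps(class_names)

print(id2label)
print(label2id)
```

The resulting dictionaries can be passed as `id2label=id2label, label2id=label2id` to `from_pretrained` (on the config or on the model class directly). They are stored on `model.config`, so they should be written to config.json whenever the Trainer saves a checkpoint. If the pretrained checkpoint's classification head has a different number of classes, `ignore_mismatched_sizes=True` may also be needed.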
