Problems in deployment when I configure my own labels

Hi.

I am training a binary classification model based on this model checkpoint: dccuchile/bert-base-spanish-wwm-cased

When deployed, it uses these labels by default: LABEL_0, LABEL_1.
My goal is to deploy it with my own labels. Browsing the forum I came across this thread, which seems to solve a similar problem.

After reading that thread, my code looks like this:

    label2id = {
        "0": "goodUser",
        "1": "badUserFraud"
    }

    id2label = {
        "goodUser": 0,
        "badUserFraud": 1
    }

    # download model from model hub
    config = AutoConfig.from_pretrained(args.model_name, label2id=label2id, id2label=id2label)
    model = AutoModelForSequenceClassification.from_pretrained(args.model_name, config=config)
    tokenizer = AutoTokenizer.from_pretrained(args.model_name)

The training step goes well, but in the deployment step I get this error:

Thanks in advance

I would like to know this as well.

Hey @Oigres,

Your two mappings are swapped.
Take a look at config.json · distilbert-base-uncased-finetuned-sst-2-english at main for an example.
id2label needs the id as key and the label string as value.
label2id needs the label string as key and the id as value.

    id2label = {
        "0": "goodUser",
        "1": "badUserFraud"
    }

    label2id = {
        "goodUser": 0,
        "badUserFraud": 1
    }

An easy way to remember it: the name always reads key2value.
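
To avoid swapping the two by hand, one option is to derive both mappings from a single list of label names and check that they are inverses before passing them to the config. This is only a sketch: `labels`, `check_label_maps` and `load_for_training` are hypothetical names, not part of transformers, and integer ids are used here (config.json shows them as strings because JSON keys are always strings).

```python
# Hypothetical label list; the position in the list defines the class id.
labels = ["goodUser", "badUserFraud"]

# id2label: key is the integer id, value is the label string.
id2label = {i: name for i, name in enumerate(labels)}
# label2id: key is the label string, value is the integer id.
label2id = {name: i for i, name in id2label.items()}


def check_label_maps(id2label, label2id):
    """Raise ValueError if the mappings are swapped or not exact inverses."""
    for idx, name in id2label.items():
        if not isinstance(idx, int) or not isinstance(name, str):
            raise ValueError(f"id2label must map int -> str, got {idx!r}: {name!r}")
        if label2id.get(name) != idx:
            raise ValueError(f"label2id[{name!r}] should be {idx}")
    if len(label2id) != len(id2label):
        raise ValueError("id2label and label2id differ in size")


check_label_maps(id2label, label2id)


def load_for_training(model_name):
    """Load config, model and tokenizer with the custom label names."""
    # Imported lazily so the mapping code above runs without transformers.
    from transformers import (
        AutoConfig,
        AutoModelForSequenceClassification,
        AutoTokenizer,
    )

    config = AutoConfig.from_pretrained(
        model_name, id2label=id2label, label2id=label2id
    )
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, config=config
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer
```

Calling `check_label_maps(label2id, id2label)` (arguments swapped, as in the original post) raises a `ValueError`, so the mistake shows up before training rather than at deployment.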

Ohh! Seems true that the best debugging technique is taking a rest :sweat_smile:

Thank you very much

Thanks for the awesome information.

Thanks, my issue has been fixed.

Hi, I have a question here.

If I have already converted my labels to numbers and want to use the same, should I still create label2id and id2label?

I am trying to perform sequence classification in binary and multi-class settings.

Any discussion to help me understand this is greatly appreciated.

Thanks
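
Not a definitive answer, but a sketch of how the two cases differ, as far as I understand it: the model trains on the integer ids either way, and the mappings only decide which names appear in the output. Without them you get placeholder names in the LABEL_0, LABEL_1 style mentioned in the first post. The helpers `default_id2label` and `decode` below are hypothetical and only imitate that naming; they are not transformers functions.

```python
def default_id2label(num_labels):
    """Placeholder names in the LABEL_<id> style seen in the first post."""
    return {i: f"LABEL_{i}" for i in range(num_labels)}


def decode(pred_id, id2label=None, num_labels=2):
    """Turn a predicted class id into a name, falling back to placeholders."""
    names = id2label if id2label is not None else default_id2label(num_labels)
    return names[pred_id]


# Hypothetical integer-encoded targets; these are used for training as they are.
y = [0, 1, 1, 0]

# Optional readable names for the same ids (taken from the thread above).
custom = {0: "goodUser", 1: "badUserFraud"}

without_names = [decode(i) for i in y]
with_names = [decode(i, custom) for i in y]
```

So with labels already converted to numbers, the mappings look optional for training and mainly matter for readable predictions. For the multi-class setting, the number of classes still has to match the data (`num_labels` in the config).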