BERT-base for text classification and MLflow

Hello!

I am currently trying to fine-tune bert-base-uncased on a hate-speech detection dataset, so I load the model with AutoModelForSequenceClassification and pass it the number of classes in my data.

As I understand from the documentation, a classification head with the requested number of output labels is placed on top of BERT, and the model can then be fine-tuned for the downstream task of hate-speech detection. Training on its own runs to completion without any errors, but when I train with the experiment being logged to MLflow, the following error occurs:

mlflow.exceptions.RestException: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Params were already logged='[{'key': 'problem_type', 'old_value': 'None', 'new_value': 'single_label_classification'}]' for run ID='e59139f97fb24f0b8242728892323d37'.

This means that the parameter problem_type is first logged to MLflow as None, later takes the value single_label_classification, and cannot be logged again because MLflow does not allow a parameter's value to change within a run.
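To make the mechanism concrete, here is a minimal sketch in plain Python, not library code. `infer_problem_type` paraphrases the rule that, as far as I can tell, transformers applies in the sequence-classification forward pass when `config.problem_type` is still None. `ParamStore` and `simulate_run` are hypothetical stand-ins I made up to mimic MLflow's "a param cannot change within a run" rule, assuming the config gets logged once before training and once again later.

```python
from typing import Dict, Optional


def infer_problem_type(num_labels: int, labels_are_integer: bool) -> str:
    """Paraphrase of the rule used when config.problem_type is None."""
    if num_labels == 1:
        return "regression"
    if labels_are_integer:
        return "single_label_classification"
    return "multi_label_classification"


class ParamStore:
    """Hypothetical stand-in for a run's params: values are write-once."""

    def __init__(self) -> None:
        self._params: Dict[str, str] = {}

    def log_param(self, key: str, value: object) -> None:
        new_value = str(value)  # params are compared as strings here
        old_value = self._params.setdefault(key, new_value)
        if old_value != new_value:
            raise ValueError(
                f"Changing param values is not allowed: {key!r} "
                f"was {old_value!r}, now {new_value!r}"
            )


def simulate_run(initial_problem_type: Optional[str]) -> Optional[str]:
    """Log the value, let a forward pass fill it in, then log it again.

    Returns the error message if the second logging call is rejected,
    otherwise None.
    """
    store = ParamStore()
    problem_type = initial_problem_type
    store.log_param("problem_type", problem_type)  # stored as 'None' if unset
    if problem_type is None:
        # Stands in for the first forward pass with integer class labels.
        problem_type = infer_problem_type(num_labels=3, labels_are_integer=True)
    try:
        store.log_param("problem_type", problem_type)
    except ValueError as err:
        return str(err)
    return None
```

In this toy version, `simulate_run(None)` returns an error message because the value changes between the two logging calls, while `simulate_run("single_label_classification")` returns None because the value is the same both times.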

Is this a bug? Is there any solution other than just hard-coding the parameter before logging it to MLflow?

Thank you in advance,
Petrina


This looks very similar to this bug. Either some remnant of that bug is still present, or your library version is out of date.

I finally found a workaround that avoids hard-coding the model's config: AutoModelForSequenceClassification.from_pretrained() accepts a problem_type parameter that lets you specify the problem type when loading the model.

In my case I used AutoModelForSequenceClassification.from_pretrained('cardiffnlp/tweet_sentiment_multilingual', num_labels=3, problem_type='single_label_classification'). This sets the problem type to single_label_classification from the start, so the value never changes during training and MLflow is no longer asked to log two different values for the same parameter.
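For anyone who wants to check this behaviour without downloading anything, here is a rough sketch. `load_classifier` wraps the call above. `tiny_bert` and `problem_type_before_and_after` are helper names I made up, and the tiny BERT sizes are arbitrary assumptions chosen so that a randomly initialised model can be built offline. It assumes torch and transformers are installed, and the exact behaviour may differ between transformers versions.

```python
import torch
from transformers import (
    AutoModelForSequenceClassification,
    BertConfig,
    BertForSequenceClassification,
)


def load_classifier(checkpoint: str, num_labels: int):
    """Load a checkpoint with the problem type fixed up front (needs network)."""
    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=num_labels,
        problem_type="single_label_classification",
    )


def tiny_bert(problem_type=None) -> BertForSequenceClassification:
    """Randomly initialised toy BERT so the check runs offline.

    The sizes are arbitrary assumptions, not a real checkpoint.
    """
    config = BertConfig(
        vocab_size=100,
        hidden_size=32,
        num_hidden_layers=1,
        num_attention_heads=2,
        intermediate_size=64,
        num_labels=3,
        problem_type=problem_type,
    )
    return BertForSequenceClassification(config)


def problem_type_before_and_after(model):
    """Return config.problem_type before and after one forward pass with labels."""
    before = model.config.problem_type
    input_ids = torch.tensor([[1, 2, 3, 4]])  # one short dummy sequence
    labels = torch.tensor([2])                # one integer class id per example
    model(input_ids=input_ids, labels=labels)
    return before, model.config.problem_type
```

Calling `problem_type_before_and_after(tiny_bert())` should show the value going from None to single_label_classification, which is the change MLflow rejects. With `tiny_bert("single_label_classification")` the value should be the same before and after. For a real run, call `load_classifier` with your own checkpoint name and number of classes.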

A small note, because the single_label_classification name confused me at first: in Hugging Face, any problem where each example has exactly one label is a single_label_classification problem, regardless of how many classes there are, and a problem where one example can carry several labels is a multi_label_classification problem.

Thank you,
Petrina

