Bert-base for text classification and MLFlow

Petrina · November 4, 2024, 9:11am

Hello!

I am trying to fine-tune bert-base-uncased on a hate-speech detection dataset, so I load the model with AutoModelForSequenceClassification and pass it the number of classes I have.

As I understand from the documentation, a classification head with the matching number of output labels is added on top of BERT, and the model can then be fine-tuned for the downstream task of hate-speech detection. Training on its own runs and completes without errors, but when I train while logging the experiment to MLflow, the following error occurs:

mlflow.exceptions.RestException: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Params were already logged='[{'key': 'problem_type', 'old_value': 'None', 'new_value': 'single_label_classification'}]' for run ID='e59139f97fb24f0b8242728892323d37'.

This means that problem_type is first logged to MLflow as None, then takes the value single_label_classification, and the second value cannot be logged because it differs from the first.
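To make the mechanism concrete, here is a minimal pure-Python sketch of a write-once parameter store. It is not MLflow's implementation: `WriteOnceParams` and `ParamConflict` are hypothetical names, and the `config` dict only stands in for a model config. The sketch assumes nothing more than the rule the error message states, namely that a param already logged for a run cannot take a different value later.

```python
class ParamConflict(Exception):
    """Raised when a param is re-logged with a different value (hypothetical)."""


class WriteOnceParams:
    """Toy stand-in for a run's params: values are stringified and immutable."""

    def __init__(self):
        self._params = {}

    def log_param(self, key, value):
        new_value = str(value)
        old_value = self._params.get(key)
        if old_value is not None and old_value != new_value:
            raise ParamConflict(
                f"key={key!r} old_value={old_value!r} new_value={new_value!r}"
            )
        self._params[key] = new_value


# Simulated model config: problem_type is unset when the model is loaded.
config = {"problem_type": None}

run = WriteOnceParams()
run.log_param("problem_type", config["problem_type"])  # stored as the string "None"

# During training the config is updated in place ...
config["problem_type"] = "single_label_classification"

# ... so logging the same key again conflicts with the stored "None".
try:
    run.log_param("problem_type", config["problem_type"])
    conflict = None
except ParamConflict as exc:
    conflict = str(exc)

print(conflict)
```

Re-logging a key with the same value is harmless in this sketch; only a changed value is rejected, which matches the old_value/new_value pair in the error above.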

Is this a bug? Is there any solution other than just hard-coding the parameter before logging it to MLflow?

Thank you in advance,
Petrina

John6666 · November 4, 2024, 1:29pm

It looks very similar to this bug, so either remnants of the bug are still present or your library version is out of date.
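To rule out an outdated install, it may help to print the versions actually present in the training environment. The following is a small sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package names are the PyPI distribution names of the two libraries discussed here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("mlflow", "transformers")):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

The printed versions can then be compared against the release in which the linked fix landed.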

github.com/mlflow/mlflow

[BUG] INVALID_PARAMETER_VALUE: Changing param values is not allowed.

opened 03:10AM - 25 Oct 22 UTC

closed 06:57AM - 26 Oct 22 UTC

tonyreina

bug area/projects area/tracking

### Issues Policy acknowledgement

- [X] I have read and agree to submit bug reports in accordance with the [issues policy](https://www.github.com/mlflow/mlflow/blob/master/ISSUE_POLICY.md)

### Willingness to contribute

No. I cannot contribute a bug fix at this time.

### MLflow version

1.30.0

### System information

- **OS Platform and Distribution (e.g., Linux Ubuntu 16.04)**: WSL Ubuntu 20.04
- **Python version**: 3.10.6
- **yarn version, if running the dev UI**: N/A

### Describe the problem

Autologging for TensorFlow (tf.keras) works when I run just `python train.py` but not when I run it from `mlflow run` on the MLproject (which uses the same `train.py` script). It appears that the autologger logs the state of the model during creation and this prevents it from updating the log values after training. Here's the error I get:

`2022/10/24 19:46:15 WARNING mlflow.utils.autologging_utils: Encountered unexpected error during tensorflow autologging: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Params were already logged='[{'key': 'validation_split', 'old_value': None, 'new_value': '0.0'}, {'key': 'shuffle', 'old_value': None, 'new_value': 'True'}, {'key': 'class_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'sample_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'initial_epoch', 'old_value': None, 'new_value': '0'}, {'key': 'steps_per_epoch', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_steps', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_batch_size', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_freq', 'old_value': None, 'new_value': '1'}, {'key': 'max_queue_size', 'old_value': None, 'new_value': '10'}, {'key': 'workers', 'old_value': None, 'new_value': '1'}, {'key': 'use_multiprocessing', 'old_value': None, 'new_value': 'False'}]' for run ID='402712a4625a43bca38c0bce38fa4ed1'.`

As you can see, the autolog apparently logged `None` for all of these values. Again, the same script works well when I run it outside of MLproject.

### Tracking information

_No response_

### Code to reproduce issue

```python
"""
TF/Keras Training script for MLFlow
"""
# mlflow run -e train_entry --env-manager=local --experiment-name=tony-reina-experiments .

from datetime import datetime
import os

os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

import click  # pip install click
import tensorflow as tf  # pip install tensorflow

# The following import and function call,
# are the only additions to code required
# to automatically log
# metrics and parameters to MLflow.
import mlflow  # pip install mlflow

EXPERIMENT_NAME = "tony-reina-experiments"


def load_data():
    """Load dataset and pre-process

    Fashion MNIST
    https://github.com/zalandoresearch/fashion-mnist
    28x28 grayscale images of clothes from 10 different categories
    """
    fashion_mnist = tf.keras.datasets.fashion_mnist
    (train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

    # Normalize the images from 0.0 to 1.0
    train_images = train_images / 255.0
    test_images = test_images / 255.0

    # Human-readable class names
    class_names = [
        "T-shirt/top",
        "Trouser",
        "Pullover",
        "Dress",
        "Coat",
        "Sandal",
        "Shirt",
        "Sneaker",
        "Bag",
        "Ankle boot",
    ]

    return train_images, train_labels, test_images, test_labels, class_names


def create_model(parameters):
    """Create a simple TensorFlow Keras model

    Args:
        parameters(dict): Number of units, optimizer, and metrics for model
    """
    model = tf.keras.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(parameters["num_units"], activation="relu"),
            tf.keras.layers.Dense(10),
        ]
    )

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=parameters["learning_rate"]),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=[parameters["metrics"]],
    )

    return model


@click.command(help="The base training script for MLFlow.")
@click.option(
    "--num-units", default=128, type=int, help="Number of units in the dense layer"
)
@click.option("--epochs", default=3, type=int, help="Number of training epochs")
@click.option("--batch-size", default=32, type=int, help="Batch size")
@click.option("--learning-rate", default=1e-4, type=float, help="Learning rate")
@click.option("--metrics", default="accuracy", type=str, help="Model metric to track")
@click.option(
    "--training-data", default=".", type=str, help="Path to the training data"
)
@click.option("--testing-data", default=".", type=str, help="Path to the testing data")
def train(
    num_units,
    epochs,
    batch_size,
    learning_rate,
    metrics,
    training_data,
    testing_data,
):
    """Run training"""
    train_images, train_labels, test_images, test_labels, class_names = load_data()

    # Instead of passing lots of variables,
    # we'll just pass a dictionary
    parameters = {
        "num_units": num_units,
        "num_epochs": epochs,
        "batch_size": batch_size,
        "learning_rate": learning_rate,
        "metrics": metrics,
        "training_data": training_data,
        "testing_data": testing_data,
    }

    click.secho(parameters)

    click.secho("Setting up MLflow tracking uri...")
    mlflow.tracking.set_tracking_uri(os.environ.get("MLFLOW_TRACKING_URI"))
    mlflow.set_experiment(experiment_name=EXPERIMENT_NAME)

    mlflow.tensorflow.autolog(
        log_models=True,
        silent=False,
        registered_model_name="ye_olde_mnist_fashion",
    )

    current_time = datetime.now().strftime("%Y-%m-%d %H-%M-%S")

    click.secho("Starting the MLFlow Run...")

    model = create_model(parameters)

    with mlflow.start_run(
        #run_name=f"YeOldDemo-{current_time}",
        tags={"ImageTag": "local"},
        description="Ye Olde Model Xample",
    ):
        model.fit(
            train_images,
            train_labels,
            epochs=parameters["num_epochs"],
            batch_size=parameters["batch_size"],
        )

        click.secho("Finished training")

        test_loss, test_acc = model.evaluate(test_images, test_labels)

        mlflow.log_param(key="test_loss", value=test_loss)
        mlflow.log_param(key="test_acc", value=test_acc)
        mlflow.log_param(key="Class names", value=class_names)
        mlflow.log_param(key="TensorFlow version", value=tf.__version__)


if __name__ == "__main__":
    train()
```

### Stack trace

```
2022/10/24 19:49:34 WARNING mlflow.utils.autologging_utils: Encountered unexpected error during tensorflow autologging: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Params were already logged='[{'key': 'validation_split', 'old_value': None, 'new_value': '0.0'}, {'key': 'shuffle', 'old_value': None, 'new_value': 'True'}, {'key': 'class_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'sample_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'initial_epoch', 'old_value': None, 'new_value': '0'}, {'key': 'steps_per_epoch', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_steps', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_batch_size', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_freq', 'old_value': None, 'new_value': '1'}, {'key': 'max_queue_size', 'old_value': None, 'new_value': '10'}, {'key': 'workers', 'old_value': None, 'new_value': '1'}, {'key': 'use_multiprocessing', 'old_value': None, 'new_value': 'False'}]' for run ID='a6fc78cd0973486aa6b0ddb5f36581ae'.
```

### Other info / logs

```
2022/10/24 19:49:34 WARNING mlflow.utils.autologging_utils: Encountered unexpected error during tensorflow autologging: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Params were already logged='[{'key': 'validation_split', 'old_value': None, 'new_value': '0.0'}, {'key': 'shuffle', 'old_value': None, 'new_value': 'True'}, {'key': 'class_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'sample_weight', 'old_value': None, 'new_value': 'None'}, {'key': 'initial_epoch', 'old_value': None, 'new_value': '0'}, {'key': 'steps_per_epoch', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_steps', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_batch_size', 'old_value': None, 'new_value': 'None'}, {'key': 'validation_freq', 'old_value': None, 'new_value': '1'}, {'key': 'max_queue_size', 'old_value': None, 'new_value': '10'}, {'key': 'workers', 'old_value': None, 'new_value': '1'}, {'key': 'use_multiprocessing', 'old_value': None, 'new_value': 'False'}]' for run ID='a6fc78cd0973486aa6b0ddb5f36581ae'.
```

### What component(s) does this bug affect?

- [ ] `area/artifacts`: Artifact stores and artifact logging
- [ ] `area/build`: Build and test infrastructure for MLflow
- [ ] `area/docs`: MLflow documentation pages
- [ ] `area/examples`: Example code
- [ ] `area/model-registry`: Model Registry service, APIs, and the fluent client calls for Model Registry
- [ ] `area/models`: MLmodel format, model serialization/deserialization, flavors
- [ ] `area/pipelines`: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
- [X] `area/projects`: MLproject format, project running backends
- [ ] `area/scoring`: MLflow Model server, model deployment tools, Spark UDFs
- [ ] `area/server-infra`: MLflow Tracking server backend
- [X] `area/tracking`: Tracking Service, tracking client APIs, autologging

### What interface(s) does this bug affect?

- [ ] `area/uiux`: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- [ ] `area/docker`: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- [ ] `area/sqlalchemy`: Use of SQLAlchemy in the Tracking Service or Model Registry
- [ ] `area/windows`: Windows support

### What language(s) does this bug affect?

- [ ] `language/r`: R APIs and clients
- [ ] `language/java`: Java APIs and clients
- [ ] `language/new`: Proposals for new client languages

### What integration(s) does this bug affect?

- [ ] `integrations/azure`: Azure and Azure ML integrations
- [ ] `integrations/sagemaker`: SageMaker integrations
- [ ] `integrations/databricks`: Databricks integrations

github.com/mlflow/mlflow

`mlflow server`: Fix log batch and set tag validation for empty string values

mlflow:master ← dbczumar:fix_log_batch_validation

opened 05:04PM - 01 Jul 22 UTC

dbczumar

+60 -17

## Related Issues/PRs

Closes #6177

## What changes are proposed in this pull request?

Fix a bug in `mlflow server` that causes `SetTag` and `LogBatch` requests to be erroneously rejected if empty strings are used as parameter / tag values.

## How is this patch tested?

Added integration tests

## Does this PR change the documentation?

- [X] No. You can skip the rest of this section.
- [ ] Yes. Make sure the changed pages / sections render correctly by following the steps below.

1. Check the status of the `ci/circleci: build_doc` check. If it's successful, proceed to the next step, otherwise fix it.
2. Click `Details` on the right to open the job page of CircleCI.
3. Click the `Artifacts` tab.
4. Click `docs/build/html/index.html`.
5. Find the changed pages / sections and make sure they render correctly.

## Release Notes

Fix a bug in `mlflow server` that causes `SetTag` and `LogBatch` requests to be erroneously rejected if empty strings are used as parameter / tag values.

### Is this a user-facing change?

- [ ] No. You can skip the rest of this section.
- [X] Yes. Give a description of this change to be included in the release notes for MLflow users. (Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)

### What component(s), interfaces, languages, and integrations does this PR affect?

Components

- [ ] `area/artifacts`: Artifact stores and artifact logging
- [ ] `area/build`: Build and test infrastructure for MLflow
- [ ] `area/docs`: MLflow documentation pages
- [ ] `area/examples`: Example code
- [ ] `area/model-registry`: Model Registry service, APIs, and the fluent client calls for Model Registry
- [ ] `area/models`: MLmodel format, model serialization/deserialization, flavors
- [ ] `area/pipelines`: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
- [ ] `area/projects`: MLproject format, project running backends
- [ ] `area/scoring`: MLflow Model server, model deployment tools, Spark UDFs
- [X] `area/server-infra`: MLflow Tracking server backend
- [x] `area/tracking`: Tracking Service, tracking client APIs, autologging

Interface

- [ ] `area/uiux`: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- [ ] `area/docker`: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- [ ] `area/sqlalchemy`: Use of SQLAlchemy in the Tracking Service or Model Registry
- [ ] `area/windows`: Windows support

Language

- [ ] `language/r`: R APIs and clients
- [ ] `language/java`: Java APIs and clients
- [ ] `language/new`: Proposals for new client languages

Integrations

- [ ] `integrations/azure`: Azure and Azure ML integrations
- [ ] `integrations/sagemaker`: SageMaker integrations
- [ ] `integrations/databricks`: Databricks integrations

### How should the PR be classified in the release notes? Choose one:

- [ ] `rn/breaking-change` - The PR will be mentioned in the "Breaking Changes" section
- [ ] `rn/none` - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- [ ] `rn/feature` - A new user-facing feature worth mentioning in the release notes
- [X] `rn/bug-fix` - A user-facing bug fix worth mentioning in the release notes
- [ ] `rn/documentation` - A user-facing documentation change worth mentioning in the release notes

Petrina · November 21, 2024, 10:08am

I finally found a workaround that avoids hard-coding the model's config. AutoModelForSequenceClassification.from_pretrained() accepts a problem_type parameter, with which you can specify the problem type when the model is loaded.

In my case I used AutoModelForSequenceClassification.from_pretrained('cardiffnlp/tweet_sentiment_multilingual', num_labels=3, problem_type='single_label_classification'), which sets the problem type to single_label_classification from the beginning. The value therefore never changes, and no conflict arises when the parameter is logged to MLflow.
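For reuse across scripts, the same call can be wrapped in a small helper. This is only a sketch: `classification_kwargs` and `load_classifier` are hypothetical names, and the three allowed strings are, to my knowledge, the values the Transformers config accepts for problem_type. The transformers import is deferred so that the keyword-building part runs even where the library is not installed.

```python
ALLOWED_PROBLEM_TYPES = (
    "regression",
    "single_label_classification",
    "multi_label_classification",
)


def classification_kwargs(num_labels, problem_type="single_label_classification"):
    """Build from_pretrained keyword arguments with problem_type set explicitly."""
    if problem_type not in ALLOWED_PROBLEM_TYPES:
        raise ValueError(f"problem_type must be one of {ALLOWED_PROBLEM_TYPES}")
    return {"num_labels": num_labels, "problem_type": problem_type}


def load_classifier(checkpoint, num_labels, problem_type="single_label_classification"):
    """Load a sequence classifier whose config carries problem_type from the start."""
    # Imported lazily so classification_kwargs works without transformers installed.
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint, **classification_kwargs(num_labels, problem_type)
    )


print(classification_kwargs(3))
```

Calling load_classifier('cardiffnlp/tweet_sentiment_multilingual', 3) would then be equivalent to the call quoted above.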

A small note, because the single_label_classification name confused me at first: in Hugging Face, a problem with exactly one label per example is a single_label_classification problem, and a problem where an example can have several labels is a multi_label_classification problem.
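As far as I can tell, when problem_type is left as None the sequence classification heads in Transformers fill it in during the first forward pass that receives labels, based on num_labels and the label dtype. That would explain why the value changes mid-run. The sketch below restates that rule in plain Python; `infer_problem_type` is a hypothetical name and the boolean flag stands in for the dtype check, so treat it as an approximation rather than the library's code.

```python
def infer_problem_type(num_labels, labels_are_integer_ids):
    """Approximate the fallback used when config.problem_type is None."""
    if num_labels == 1:
        # A single output unit is treated as a regression target.
        return "regression"
    if num_labels > 1 and labels_are_integer_ids:
        # One integer class id per example.
        return "single_label_classification"
    # Otherwise labels are assumed to be multi-hot float vectors.
    return "multi_label_classification"


# Three-class hate-speech labels stored as integer ids:
print(infer_problem_type(3, labels_are_integer_ids=True))
```

Setting problem_type explicitly at load time, as in the workaround, means this inference never has to rewrite the config after it has already been logged.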

Thank you,
Petrina

system · November 21, 2024, 10:09pm

This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.
