Passing Trainer state as an artifact in kfp.v2 pipeline

I’m trying to create a kfp pipeline that fine-tunes BERT on GCP’s Vertex AI. I can save the model as an artifact, but I’m having trouble saving the Trainer (the Trainer’s state). I would like to split evaluation and testing into two separate pipeline components, and to do that I need to reconstruct the Trainer in the second one.

I have created trainer_artifact and would like to save_state() into trainer_artifact.path. However, save_state() does not accept any arguments and saves to the model’s output directory by default. How can I save the Trainer as an artifact in this situation? Maybe there is a workaround, e.g. recreating it in the next step?
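
One workaround I have been considering is to bypass save_state() and write the state JSON straight to the artifact path with trainer.state.save_to_json(), which is what save_state() calls under the hood (just with a fixed filename inside output_dir). A rough, untested sketch, and I realize this only captures the TrainerState (step counters, log history), not the model or optimizer:

```
# in the fine-tuning component, after trainer.train():
trainer.state.save_to_json(trainer_artifact.path)

# in the next component, the state could be read back with:
from transformers import TrainerState
state = TrainerState.load_from_json(trainer_artifact.path)
```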

I attach my code.

```
from kfp.v2.dsl import component, Input, Output, Dataset, Model, Artifact

@component(
    packages_to_install=[
        "pandas",
        "datasets",
        "transformers",
    ],
)
def fine_tune_model(
    small_train_dataset: Input[Dataset],
    small_eval_dataset: Input[Dataset],
#     full_train_dataset: Input[Dataset],
#     full_eval_dataset: Input[Dataset],
    model_artifact: Output[Model],
    trainer_artifact: Output[Artifact]
):
    
    import pandas as pd
    import numpy as np
    import datasets
    from transformers import AutoModelForSequenceClassification
    from transformers import TrainingArguments
    from transformers import Trainer
    from datasets import load_metric
    
    # create model
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

    # load data
    train_data = datasets.load_from_disk(small_train_dataset.path)
    eval_data = datasets.load_from_disk(small_eval_dataset.path)
    
    metric = load_metric("accuracy")

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        predictions = np.argmax(logits, axis=-1)
        return metric.compute(predictions=predictions, references=labels)
        
    training_args = TrainingArguments(
        output_dir="test_trainer",
        evaluation_strategy="epoch",
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        num_train_epochs=3,
        seed=0,)
    
    trainer = Trainer(
        model=model, 
        args=training_args, 
        train_dataset=train_data, 
        eval_dataset=eval_data,
        compute_metrics=compute_metrics
    )
    
    train_output = trainer.train()

    model_artifact.metadata["train_output"] = train_output
    model_artifact.metadata["framework"] = "Pytorch"
    
    # How to save trainer?
    
    trainer.save_model(model_artifact.path)
```
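
For reference, this is roughly the follow-up evaluation component I have in mind; the name, the Metrics output, and the placeholder output_dir are just assumptions on my side, and I have not run it yet because of the Trainer question above:

```
from kfp.v2.dsl import component, Input, Output, Dataset, Model, Metrics

@component(
    packages_to_install=["datasets", "transformers"],
)
def evaluate_model(
    small_eval_dataset: Input[Dataset],
    model_artifact: Input[Model],
    metrics: Output[Metrics],
):
    import datasets
    from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

    eval_data = datasets.load_from_disk(small_eval_dataset.path)

    # reload the fine-tuned model saved by the training component
    model = AutoModelForSequenceClassification.from_pretrained(model_artifact.path)

    # rebuild a Trainer just for evaluation; compute_metrics would have to be
    # redefined here if accuracy is needed, otherwise evaluate() only reports loss
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="eval_output", per_device_eval_batch_size=8),
        eval_dataset=eval_data,
    )

    results = trainer.evaluate()
    for name, value in results.items():
        metrics.log_metric(name, float(value))
```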

Maybe you can move the state file that gets created?
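
For example, something along these lines (untested; it assumes the default trainer_state.json filename that save_state() writes into output_dir):

```
import os
import shutil

trainer.save_state()  # writes trainer_state.json into training_args.output_dir

# move the generated state file to the path KFP tracks for the trainer artifact
shutil.move(
    os.path.join(training_args.output_dir, "trainer_state.json"),
    trainer_artifact.path,
)
```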