Fine-tuning BART using a custom loss

Hi everyone,

I want to fine-tune BART using a custom loss. What I want to do is take the output text generated by the BART model, feed it to a classifier, and update the weights of the BART model using the classification loss. Please note that I do not want to train the classifier; rather, I want to train the BART model using the classification loss on the generated text. Can someone give me pointers on how to do this?

TIA

Hi @himanshu, the simplest way to implement custom loss functions is by subclassing the Trainer class and overriding the compute_loss function, e.g.

from transformers import Trainer

class BartTrainer(Trainer):
    def compute_loss(self, model, inputs):
        # implement custom logic here
        custom_loss = ...
        return custom_loss

You can find more details in the docs here: Trainer — transformers 4.3.0 documentation
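
For a concrete starting point, here's a sketch that just re-implements the default token-level cross-entropy, so you can see where your own logic would slot in (BartTrainer here is the same hypothetical subclass as above):

import torch.nn as nn
from transformers import Trainer

class BartTrainer(Trainer):
    def compute_loss(self, model, inputs):
        labels = inputs["labels"]
        # keep labels in the inputs so BART builds its decoder_input_ids
        outputs = model(**inputs)
        logits = outputs.logits
        # token-level cross-entropy over the vocabulary, ignoring the
        # -100 padding positions - swap in whatever loss you need here
        loss_fct = nn.CrossEntropyLoss(ignore_index=-100)
        return loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))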

Hi, this may be a bit naive, but do you then pass the parameters to BartTrainer as you would to Trainer, like this:

trainer = BartTrainer(
    model=model,                         # the instantiated Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_dataset,         # training dataset
    eval_dataset=val_dataset,            # evaluation dataset
    compute_metrics=compute_metrics,
)
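
and then simply start training as usual with:

trainer.train()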

yep, that’s it - by subclassing the Trainer class, BartTrainer inherits all of its attributes and methods 🙂

strictly speaking, if you override __init__ in the subclass you'll also need to call super().__init__(), e.g.

class BartTrainer(Trainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def compute_loss(self, model, inputs):
        # implement custom logic here
        custom_loss = ...
        return custom_loss

there’s a nice template of a custom trainer for question-answering here that you could work from: transformers/trainer_qa.py at master · huggingface/transformers · GitHub

Hi @lewtun, thanks for the reply. Actually, I am looking for how to compute the loss, i.e. how to “implement custom logic”. What I mean is that the output of BART is some text; I want to feed that text to the classifier and update the weights of BART based on the classification loss. How do I code such a system? I hope that makes sense 🙂.
TIA

Hi @himanshu, could you provide a code snippet of what you have in mind (even if it's pseudo-code)? For example, can you show what the inputs / outputs of the classifier should look like and how the classification loss is computed?

Hi @lewtun, I have something like this in mind.

def compute_loss(self, model, inputs, classifier):
    # generate token ids with BART, then decode them to text
    output_ids = model.generate(inputs["input_ids"])
    text = tok.batch_decode(output_ids, skip_special_tokens=True)
    # re-tokenize the text and run it through the (frozen) classifier
    classifier_output = classifier(text)
    loss = loss_function(classifier_output, targets)
    return loss

Now, when I do loss.backward(), I want to update the weights of model and not classifier. I understand that there is non-differentiability involved as soon as I use the model to generate text; I want to know how to handle that.
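
One workaround I've been wondering about (just a sketch, assuming the classifier shares BART's tokenizer/vocabulary; soft_classifier_loss and the argument names are placeholders) is to skip the hard decode step entirely and instead feed the classifier a soft mixture of its input embeddings, weighted by BART's output distribution:

import torch.nn.functional as F

def soft_classifier_loss(bart, classifier, inputs, targets):
    # teacher-forced forward pass through BART - differentiable,
    # unlike .generate(), which emits hard (non-differentiable) token ids
    logits = bart(**inputs).logits                            # (batch, seq, vocab)
    probs = F.softmax(logits, dim=-1)

    # soft "decode": instead of converting probs to text, mix the
    # classifier's input embeddings with BART's output distribution
    # (assumes the classifier uses the same tokenizer/vocab as BART)
    embed_matrix = classifier.get_input_embeddings().weight  # (vocab, hidden)
    soft_embeds = probs @ embed_matrix                        # (batch, seq, hidden)

    out = classifier(inputs_embeds=soft_embeds)
    # gradients flow back through probs into BART; freeze the classifier
    # beforehand with classifier.requires_grad_(False)
    return F.cross_entropy(out.logits, targets)

but I'm not sure whether that's the right way to go.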

@lewtun @himanshu Any ideas on how to handle this?

@himanshu @minji Did any of you make it work?

I did it using PyTorch Lightning and changed this part:

import torch.nn.functional as F

# batch is presumably (input_ids, attention_mask, labels)
outputs = self(batch[0], batch[1], batch[2])
loss = outputs["loss"]
# flatten logits and labels for token-level cross-entropy
y_hat = outputs["logits"].view(-1, outputs["logits"].shape[-1])
y = batch[2].view(-1)
cross_entropy_loss = F.cross_entropy(y_hat, y, reduction="mean")  # you can change it to your own custom loss
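
(for context, this sits inside the training_step of a LightningModule - a rough sketch, where BartFineTuner is a placeholder name and batch is assumed to be (input_ids, attention_mask, labels)):

import torch
import torch.nn.functional as F
import pytorch_lightning as pl

class BartFineTuner(pl.LightningModule):
    def __init__(self, model):
        super().__init__()
        self.model = model  # a Transformers BART model

    def forward(self, input_ids, attention_mask, labels):
        return self.model(input_ids=input_ids,
                          attention_mask=attention_mask,
                          labels=labels)

    def training_step(self, batch, batch_idx):
        # batch = (input_ids, attention_mask, labels)
        outputs = self(batch[0], batch[1], batch[2])
        y_hat = outputs["logits"].view(-1, outputs["logits"].shape[-1])
        y = batch[2].view(-1)
        # replace this with your own custom loss
        loss = F.cross_entropy(y_hat, y, reduction="mean")
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=3e-5)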