Using the Pegasus model for transfer learning generates garbage summaries

Hi,
I am using PegasusForConditionalGeneration from the transformers library for transfer learning to generate summaries of chats from the SAMSum dataset, but the generated output is repetitive and completely unrelated to the context.
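For generation, I'm calling generate roughly like this (simplified sketch; the google/pegasus-large checkpoint name and the example chat text are just placeholders):

    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    # Load the Pegasus checkpoint and its tokenizer
    tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-large")
    model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-large")

    chat = "Hannah: Hey, do you have Betty's number?\nAmanda: Let me check."
    inputs = tokenizer(chat, truncation=True, padding="longest", return_tensors="pt")

    # Beam-search decoding of the summary
    summary_ids = model.generate(inputs["input_ids"],
                                 attention_mask=inputs["attention_mask"],
                                 num_beams=4, max_length=60)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))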

I am also getting nan as the loss. Can someone help me figure out what went wrong?

Please note that the same code works perfectly fine for T5ForConditionalGeneration.
Is there any difference in the implementation method?
Should I try a different loss function or optimizer?
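
For context, the only thing that changes between the two runs is the model and tokenizer class; the checkpoint names below are just examples to illustrate the swap, not necessarily the exact ones I used:

    from transformers import T5ForConditionalGeneration, T5Tokenizer
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    # This run trains fine and gives sensible summaries
    model = T5ForConditionalGeneration.from_pretrained("t5-base")
    tokenizer = T5Tokenizer.from_pretrained("t5-base")

    # This run gives nan loss and repetitive, unrelated summaries
    model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-large")
    tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-large")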

Could you post the command you are using for fine-tuning?

I’m using a Python script for this. I’m sharing a snippet of it below:

    for _ in range(epochs):
        self.model.train()
        train_loss = 0
        # Iterate over batches of tokenized chats and target summaries
        for idx, data in tqdm(enumerate(self.train_loader)):
            self.optimizer.zero_grad()
            output = self.model(input_ids=data["input_ids"],
                                attention_mask=data["attention_mask"],
                                lm_labels=data["lm_labels"])
            # The model returns the loss first, then the token-level prediction scores
            loss, prediction_scores = output[:2]
            train_loss += loss.item()
            loss.backward()
            self.optimizer.step()
            if (idx % 1000) == 0:
                print("loss: ", loss.item(), " train_loss: ", train_loss / (idx + 1))

As the variable names suggest, self.model is a PegasusForConditionalGeneration instance, self.optimizer is an Adam optimizer, and self.train_loader is a DataLoader.
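
They are set up roughly like this (simplified sketch; the batch size, learning rate, and the collate function are stand-ins for my actual preprocessing code):

    import torch
    from torch.utils.data import DataLoader
    from datasets import load_dataset
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-large")

    def collate(batch):
        # Each samsum example has a "dialogue" and a reference "summary"
        inputs = tokenizer([ex["dialogue"] for ex in batch], truncation=True,
                           padding="longest", return_tensors="pt")
        targets = tokenizer([ex["summary"] for ex in batch], truncation=True,
                            padding="longest", return_tensors="pt")
        return {"input_ids": inputs["input_ids"],
                "attention_mask": inputs["attention_mask"],
                "lm_labels": targets["input_ids"]}

    train_dataset = load_dataset("samsum", split="train")

    # These end up as self.model, self.optimizer and self.train_loader in my trainer class
    model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-large")
    optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
    train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True, collate_fn=collate)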