Hi,

I am using `PegasusForConditionalGeneration` from the `transformers` library for transfer learning, to generate summaries of chats from the SAMSum dataset. However, the generated output is repetitive and not related to the context at all, and the loss comes out as `nan`. Can someone help me figure out what went wrong?

Please note that the same code works perfectly fine with `T5ForConditionalGeneration`. Is there a difference in how the two models should be used? Should I try a different loss function or optimizer?
Could you post the command you are using for fine-tuning?
I’m fine-tuning with a Python script; a snippet of the training loop is below:
```python
for _ in range(epochs):
    self.model.train()
    train_loss = 0
    for idx, data in tqdm(enumerate(self.train_loader)):
        self.optimizer.zero_grad()
        output = self.model(input_ids=data["input_ids"],
                            attention_mask=data["attention_mask"],
                            lm_labels=data["lm_labels"])
        loss, prediction_scores = output[:2]
        train_loss += loss.item()
        loss.backward()
        self.optimizer.step()
        if (idx % 1000) == 0:
            print("loss: ", loss.item(), " train_loss: ", train_loss / (idx + 1))
```
As the variable names suggest, `self.model` holds a `PegasusForConditionalGeneration` instance, `self.optimizer` is an `Adam` optimizer, and `self.train_loader` is a `DataLoader`.