Why does PEGASUS generate summaries with <n> tags?

isollid · March 12, 2021, 3:43pm

Why does PEGASUS generate summaries with <n> tags?

Here is how I have initialized the model and how I generate summaries:
from transformers import PegasusForConditionalGeneration, PegasusTokenizerFast, PegasusConfig

import torch

torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'

pegasus_model = PegasusForConditionalGeneration.from_pretrained('google/pegasus-pubmed').to(torch_device)

pegasus_tokenizer = PegasusTokenizerFast.from_pretrained('google/pegasus-pubmed', max_position_embeddings=2048)

def pegasus_summarization(article):
  batch = pegasus_tokenizer.prepare_seq2seq_batch([article], truncation=True, padding='longest', max_target_length=250, return_tensors='pt').to(torch_device)
  translated = pegasus_model.generate(**batch)
  tgt_text = pegasus_tokenizer.batch_decode(translated, skip_special_tokens=True)
  return tgt_text[0]

And here is the resulting summary:

anxiety is the most prominent and prevalent mood disorder in parkinson’s disease ( pd ) ; however, little is known about the relationship between anxiety and cognition in pd. <n> the aim of this study was to examine the influence of anxiety on cognition in pd by directly comparing groups of pd patients with and without anxiety while excluding depression. <n> we hypothesized that pd patients with anxiety would show impairments in attentional set - shifting and working memory compared to pd patients without anxiety.

I used PEGASUS in October last year, and this was not a problem then. Maybe it is something that came with the v4.0.0 release of transformers?

I found others who have run into the same thing (https://github.com/eeic-ai-01/text2slide/blob/8af85b423f68b399b88292c8a08c2cbf5a744ea1/summarization/abstractive/summarizer/pegasus.py); note the regex substitution of the <n> tags there.
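For reference, here is a minimal sketch of that kind of post-processing workaround. It assumes the <n> markers are PEGASUS's way of encoding line breaks in the decoded text (my assumption, not something confirmed in this thread). `strip_newline_tags` is a hypothetical helper name, not part of transformers.

```python
import re


def strip_newline_tags(summary: str, replacement: str = " ") -> str:
    """Replace literal "<n>" markers in a decoded summary.

    Hypothetical helper: it only post-processes the decoded string and
    does not change how the model or tokenizer behaves.
    """
    # Swallow the whitespace around each marker so no double spaces remain.
    cleaned = re.sub(r"\s*<n>\s*", replacement, summary)
    return cleaned.strip()


if __name__ == "__main__":
    raw = "first sentence. <n> second sentence. <n> third sentence."
    print(strip_newline_tags(raw))        # sentences joined by single spaces
    print(strip_newline_tags(raw, "\n"))  # one sentence per line
```

This could be applied to the return value of `pegasus_summarization(article)` above. It hides the symptom only, so it does not answer why the tags appear in the first place.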

Appreciate all answers!
