Running google/pegasus-xsum model on a text column with articles for summarization

gs14 · March 16, 2022, 7:37pm

Hello Everyone!
I am new to the google/pegasus model and to NLP. I was able to run the following code on an individual text, but I have a data frame with a full column of speeches. I am trying to run the model on that column and save the generated summaries as another column, something like this:
df['summary'] = df['final'].apply(lambda x: lex_summarizer())

Google Pegasus code:
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
model_name = 'google/pegasus-xsum'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)
text = df['final']  # a pandas Series holding the speeches
batch = tokenizer.prepare_seq2seq_batch(text, truncation=True, padding='longest', return_tensors='pt')
translated = model.generate(**batch)
pegasus_text = tokenizer.batch_decode(translated, skip_special_tokens=True)

Error: ValueError: text input must of type str (single example), List[str] (batch or single pretokenized example) or List[List[str]] (batch of pretokenized examples).

I would appreciate it if someone could tell me how to run the above code on a text column in a pandas DataFrame and save the summaries in another column.
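
One possible fix, sketched under assumptions: the ValueError most likely comes from passing a pandas Series (`df['final']`) where the tokenizer expects a `str` or a `List[str]`, so converting the column with `.tolist()` and tokenizing in small batches should avoid it. The helper names `load_pegasus`, `summarize_texts`, and `add_summary_column`, and the `batch_size` default of 8, are hypothetical and not from the post above. The tokenizer is called directly here instead of through `prepare_seq2seq_batch`, which is deprecated in recent transformers releases.

```python
def load_pegasus(model_name="google/pegasus-xsum"):
    """Load the tokenizer and model (hypothetical helper).

    Imports are local so the helpers below can be reused without
    loading torch/transformers until a model is actually needed.
    """
    import torch
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name).to(device)
    model.eval()
    return tokenizer, model, device


def summarize_texts(texts, tokenizer, model, device, batch_size=8):
    """Summarize a plain Python list of strings in chunks of batch_size."""
    summaries = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        # The tokenizer needs List[str]; inputs are truncated to the model's
        # maximum length and padded to the longest item in the chunk.
        batch = tokenizer(
            chunk, truncation=True, padding="longest", return_tensors="pt"
        ).to(device)
        generated = model.generate(**batch)
        summaries.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return summaries


def add_summary_column(df, tokenizer, model, device,
                       source_col="final", target_col="summary", batch_size=8):
    """Write one summary per row of df[source_col] into df[target_col]."""
    # Assumption: missing values are treated as empty strings.
    texts = df[source_col].fillna("").astype(str).tolist()
    df[target_col] = summarize_texts(texts, tokenizer, model, device, batch_size)
    return df
```

Usage would then look like `tokenizer, model, device = load_pegasus()` followed by `df = add_summary_column(df, tokenizer, model, device)`. Processing the column in chunks, rather than tokenizing every speech at once, keeps memory use bounded; a smaller `batch_size` may be needed on a GPU with little memory.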
