Hi, thank you for the reply and advice.
I forgot to mention that I want the summaries to be as simple as possible, so that even the average Joe would understand them. That's why I'm trying to clean up the legalese before feeding the text to the summarizer (see the sketch below for the kind of cleanup I mean).
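As a rough illustration of that cleanup step (the substitution list here is purely hypothetical - a placeholder for whatever legalese phrases actually show up in the documents):

```python
import re

# Hypothetical substitutions for common legalese; the real list
# would be longer and tuned to the actual documents.
LEGALESE_SUBSTITUTIONS = [
    (r"\bhereinafter\b", "from now on"),
    (r"\bnotwithstanding\b", "despite"),
    (r"\bin witness whereof\b", "to confirm this"),
]

def clean_legalese(text: str) -> str:
    """Replace legalese phrases with plain-English equivalents
    before the text is passed to the summarizer."""
    for pattern, replacement in LEGALESE_SUBSTITUTIONS:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text
```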
As for the memory issue - I tried it in Google Colab with a GPU, using the Pegasus model.
Upon reaching this line - tokenizer = AutoTokenizer.from_pretrained("google/pegasus-cnn_dailymail", use_fast=False)
I got an error stating:
"ValueError: Couldn't instantiate the backend tokenizer from one of: (1) a tokenizers library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one."
Then I found the docs and added use_fast=False (as shown in the line above), but it didn't help.
I also updated pip to the latest version (pip-21.0.1) - still the same error.
I also installed sentencepiece (Successfully installed sentencepiece-0.1.91) - and still the same error persisted.
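For reference, this is roughly the minimal snippet I'm running in Colab. One thing I'm wondering is whether the runtime needs restarting after installing sentencepiece, since transformers may only detect it at import time:

```python
# In a Colab cell, install first, then restart the runtime so
# transformers re-checks for sentencepiece:
#   !pip install sentencepiece transformers

from transformers import AutoTokenizer

# use_fast=False forces the slow (sentencepiece-based) tokenizer,
# which is the call where the ValueError was raised for me.
tokenizer = AutoTokenizer.from_pretrained(
    "google/pegasus-cnn_dailymail", use_fast=False
)
```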
