I realized that the issue was that I wasn't using beam search decoding or setting a maximum length while generating. The code should be:
from transformers import pipeline # type: ignore
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer # type: ignore
import torch
checkpoint = "sshleifer/distilbart-cnn-12-6"
revision = "a4f8f3e"
summarizer_input = """
John Jeremy Thorpe (29 April 1929 – 4 December 2014) was a British politician who served as the Member of Parliament for North Devon from 1959 to 1979, and as leader of the Liberal Party from 1967 to 1976. In May 1979, he was tried at the Old Bailey on charges of conspiracy and incitement to murder his ex-boyfriend Norman Scott, a former model. Thorpe was acquitted on all charges, but the case, and the furore surrounding it, ended his political career.
Thorpe was the son and grandson of Conservative MPs, but decided to align with the small and ailing Liberal Party. After reading Law at Oxford University he became one of the Liberals' brightest stars in the 1950s. He entered Parliament at the age of 30, rapidly made his mark, and was elected party leader in 1967. After an uncertain start during which the party lost ground, Thorpe capitalised on the growing unpopularity of the Conservative and Labour parties to lead the Liberals through a period of electoral success. This culminated in the general election of February 1974, when the party won 6 million votes out of some 31 million cast. Under the first-past-the-post electoral system this gave them only 14 seats, but in a hung parliament, no party having an overall majority, Thorpe was in a strong position. He was offered a cabinet post by the Conservative prime minister, Edward Heath, if he would bring the Liberals into a coalition. His price for such a deal, reform of the electoral system, was rejected by Heath, who resigned in favour of a minority Labour government.
"""
summarizer = pipeline("summarization", model=checkpoint, revision=revision, device_map=device)
result = summarizer(summarizer_input, min_length=4*10, max_length=4*15)[0]['summary_text'] # lengths are in tokens, not words
print(result)
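
# Aside: if I'm reading the checkpoint's config correctly, the pipeline should
# already be using beam search, since these CNN-distilled BART checkpoints ship
# generation defaults (num_beams, min_length, max_length) in their config.json.
# A quick sketch to inspect what the pipeline falls back to when not overridden:
from transformers import AutoConfig # type: ignore
config = AutoConfig.from_pretrained(checkpoint, revision=revision)
for name in ("num_beams", "min_length", "max_length"):
    print(name, getattr(config, name, None)) # getattr in case an attribute is unset
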
model = AutoModelForSeq2SeqLM.from_pretrained(pretrained_model_name_or_path=checkpoint, revision=revision, device_map=device) # don't shadow the pipeline above
tokenizer = AutoTokenizer.from_pretrained(checkpoint, revision=revision)
# Tokenize, run the model's generate() with beam search, then decode the token IDs back to text
inputs = tokenizer(summarizer_input, max_length=1024, truncation=True, padding=True, return_tensors="pt") # the article is the source text; text_target is only for labels
inputs = inputs.to(device)
outputs = model.generate(inputs["input_ids"], attention_mask=inputs["attention_mask"], num_beams=4, min_length=4*10, max_length=4*15)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])
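
To see how much the decoding strategy matters here, this is a minimal sketch comparing greedy decoding against beam search on the same inputs; it just reuses the model, tokenizer, and inputs from above:

# Greedy decoding: num_beams=1 keeps only the single most likely token at each step.
greedy = model.generate(inputs["input_ids"], attention_mask=inputs["attention_mask"],
                        num_beams=1, do_sample=False, min_length=4*10, max_length=4*15)
# Beam search: num_beams=4 keeps the four best partial sequences at each step.
beam = model.generate(inputs["input_ids"], attention_mask=inputs["attention_mask"],
                      num_beams=4, min_length=4*10, max_length=4*15)
for label, out in (("greedy:", greedy), ("beam:  ", beam)):
    print(label, tokenizer.batch_decode(out, skip_special_tokens=True)[0])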