Worse output when using a BART summarizer directly (vs pipeline API)?

Hello,

I am interested in fine-tuning BART and other similar models for text summarization. I have found the pipelines API too restrictive (in particular, my inputs are occasionally over 1024 tokens long, and I would like to truncate those to size).

However, the output quality seems lower when I use the model directly than when I use the pipelines API. Here is a concrete example using "sshleifer/distilbart-cnn-12-6":

from transformers import pipeline # type: ignore
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer # type: ignore
import torch


checkpoint = "sshleifer/distilbart-cnn-12-6"
revision = "a4f8f3e"
summarizer_input = """
John Jeremy Thorpe (29 April 1929 – 4 December 2014) was a British politician who served as the Member of Parliament for North Devon from 1959 to 1979, and as leader of the Liberal Party from 1967 to 1976. In May 1979, he was tried at the Old Bailey on charges of conspiracy and incitement to murder his ex-boyfriend Norman Scott, a former model. Thorpe was acquitted on all charges, but the case, and the furore surrounding it, ended his political career.

Thorpe was the son and grandson of Conservative MPs, but decided to align with the small and ailing Liberal Party. After reading Law at Oxford University he became one of the Liberals' brightest stars in the 1950s. He entered Parliament at the age of 30, rapidly made his mark, and was elected party leader in 1967. After an uncertain start during which the party lost ground, Thorpe capitalised on the growing unpopularity of the Conservative and Labour parties to lead the Liberals through a period of electoral success. This culminated in the general election of February 1974, when the party won 6 million votes out of some 31 million cast. Under the first-past-the-post electoral system this gave them only 14 seats, but in a hung parliament, no party having an overall majority, Thorpe was in a strong position. He was offered a cabinet post by the Conservative prime minister, Edward Heath, if he would bring the Liberals into a coalition. His price for such a deal, reform of the electoral system, was rejected by Heath, who resigned in favour of a minority Labour government. 
"""

summarizer = pipeline("summarization", model=checkpoint, revision=revision, device_map="cuda")
print(summarizer(summarizer_input, min_length=4*10, max_length=4*15)[0]['summary_text'])


summarizer = AutoModelForSeq2SeqLM.from_pretrained(pretrained_model_name_or_path=checkpoint, revision=revision, device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Tokenize, run one forward pass through the model, take the argmax over the vocabulary logits, then decode
inputs = tokenizer(text_target=summarizer_input, max_length=1024, truncation=True, padding=True, return_tensors="pt")
inputs = inputs.to("cuda")
outputs = summarizer(**inputs)
tokens = torch.argmax(outputs.logits, dim=2)
print(tokenizer.batch_decode(tokens, skip_special_tokens=True))

The outputs I get are the following:

John Jeremy Thorpe was the son and grandson of Conservative MPs . He was elected leader of the Liberal Party from 1967 to 1976 . In May 1979, he was tried at the Old Bailey on charges of conspiracy and incitement to murder his ex-boyfriend Norman Scott, a former model

[" John John John Jeremy Thorpe was29 April 1929 – 4 December 2014) was a British politician. served as the Member of Parliament for North Devon from 1959 to 1979. and as leader of the Liberal Party from 1967 to 1976. In May 1979, he was tried at the Old Bailey on charges of conspiracy and incitement to murder his ex-boyfriend Norman Scott, a former model. Thorpe was acquitted on all charges, but the case, and the furore surrounding it, ended his political career. \xa0 \xa0 \xa0 \xa0 \xa0 \xa0 \xa0 \xa0.pe was the son and grandson of Conservative MPs, but decided to align with the small and ailing Liberal Party. reading Law at Oxford University he became one of the Liberals’ brightest stars in the 1950s entered Parliament at the age of 30, rapidly made his mark party leader in 1967 an uncertain start which the party lostpe capitalised on the unpopularity the and Labour parties the Liberals through a period of electoral success culminated in the general election of 1974 1974, when the party won 6 million votes out of some 31 million cast the first-past-the-post electoral system this gave them only seats in a parliament majoritype was in a strong position was a minister Heath"]

The second output, produced by calling the model directly, is less coherent, even though the same tokenizer and model were used (as far as I know). Is there a problem with how I am running the summarization? One thing I am wondering about is the min_length and max_length parameters of the pipeline; I don't immediately see a way to recreate that behaviour while using the model directly.
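For reference, here is a minimal sketch of what a closer equivalent to the pipeline call might look like. It assumes the pipeline produces its text by calling the model's generate() method, which decodes token by token, rather than taking an argmax over a single forward pass. It also passes the text as the regular tokenizer input instead of text_target, which is intended for labels. The names summarize and run_example are hypothetical helpers, the 40/60 limits just mirror the pipeline call above, and passing revision to the tokenizer and falling back to CPU are my own assumptions.

```python
def summarize(text, model, tokenizer, min_length=40, max_length=60,
              max_input_tokens=1024):
    """Hypothetical helper: summarize `text` with a seq2seq model via generate()."""
    # Encode the text as encoder input and truncate anything past the limit.
    inputs = tokenizer(
        text,
        max_length=max_input_tokens,
        truncation=True,
        return_tensors="pt",
    )
    # Move the encoded tensors to wherever the model lives.
    inputs = inputs.to(model.device)
    # generate() runs autoregressive decoding; min_length and max_length
    # bound the generated sequence in tokens.
    output_ids = model.generate(
        **inputs, min_length=min_length, max_length=max_length
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


def run_example():
    """Hypothetical wrapper: loads the checkpoint from the post (needs network)."""
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    checkpoint = "sshleifer/distilbart-cnn-12-6"
    revision = "a4f8f3e"
    # Assumption: fall back to CPU when no GPU is available.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    tokenizer = AutoTokenizer.from_pretrained(checkpoint, revision=revision)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        checkpoint, revision=revision
    ).to(device)

    text = (
        "John Jeremy Thorpe (29 April 1929 - 4 December 2014) was a British "
        "politician who served as the Member of Parliament for North Devon "
        "from 1959 to 1979, and as leader of the Liberal Party from 1967 to 1976."
    )
    print(summarize(text, model, tokenizer))
```

Calling run_example() downloads the checkpoint and prints one summary. Other generation settings, such as beam search, come from the checkpoint's own configuration, so the result may still not match the pipeline output word for word.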