Hi,
I’m trying to understand whether any settings can be adjusted in the summarization pipeline.
I tried changing various values, such as top_p and top_k, but the only one that seems to have any effect on the output is max_length, which just cuts the summary off abruptly.
Is there any way to control the generation of the summary?
Sample code I’ve tried (note: I tried this on various models, such as google/pegasus-cnn_dailymail, t5-base, and facebook/bart-large-cnn):
from transformers import pipeline

text = '<sample text here>'
model = "google/pegasus-cnn_dailymail"
summarizer = pipeline("summarization", model=model)

# Call with explicit generation settings
print(summarizer(text,
                 temperature=3,
                 num_beams=5,
                 no_repeat_ngram_size=3,
                 top_p=.6,
                 max_length=80,
                 do_sample=False,
                 truncation=True))

# The first call gives the same result as the call below, except that it gets truncated
print(summarizer(text,
                 truncation=True))