Which CNN Summarization models to use?

echatzikyriakidis · July 16, 2020, 9:05am

For which task were the following models fine-tuned / trained? Can they be used for text summarization?

sshleifer/student_cnn_12_6
sshleifer/student_cnn_6_6

Thank you.

valhalla · July 16, 2020, 2:13pm

These student models are created by copying layers from bart-large-cnn to reduce the model size. They are checkpoints that have not been fine-tuned, so you'll need to fine-tune them for summarization. More details can be found here:
https://github.com/huggingface/transformers/tree/master/examples/seq2seq#distilbart
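
To make "copying layers" concrete, here is a minimal sketch of how a subset of teacher layers might be chosen for a smaller student. It assumes that a name like student_cnn_12_6 means 12 encoder layers and 6 decoder layers, and that the kept layers are spread evenly across the teacher. Both points are assumptions for illustration, and pick_layers_to_copy is a hypothetical helper, not a function from the transformers repo.

```python
def pick_layers_to_copy(n_teacher, n_student):
    """Return the teacher layer indices a student would copy.

    Assumption: indices are spread evenly and always include the
    first and last teacher layer. The real checkpoints may have
    been built with a different selection.
    """
    if not 1 <= n_student <= n_teacher:
        raise ValueError("n_student must be between 1 and n_teacher")
    if n_student == 1:
        return [0]
    step = (n_teacher - 1) / (n_student - 1)
    return [round(i * step) for i in range(n_student)]


# A 6-layer decoder taken from a 12-layer teacher decoder.
decoder_layers = pick_layers_to_copy(12, 6)
# A 12-layer encoder keeps every teacher layer.
encoder_layers = pick_layers_to_copy(12, 12)
```

The copied layers only give the student a good initialization. It still has to be fine-tuned before its summaries are usable.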

echatzikyriakidis · July 16, 2020, 2:50pm

Thank you!

sshleifer · July 16, 2020, 4:55pm

Yes, @valhalla is 100% correct.
More info: those are the starting point for the distillation experiments that result in distilbart: https://docs.google.com/spreadsheets/d/1EkhDMwVO02m8jCD1cG3RoFPLicpcL1GQHTQjfvDYgIM/edit?usp=sharing

echatzikyriakidis · July 17, 2020, 8:55am

Thank you @sshleifer!

echatzikyriakidis · July 17, 2020, 9:46am

Hi Sam, @sshleifer

I would like to ask whether I can also find such metrics for other models,

e.g., airKlizz/, t5- and mrm8488/t5-base-finetuned-summarize-news

Thank you.

valhalla · July 17, 2020, 10:31am

cc @mrm8488

mrm8488 · July 17, 2020, 10:44am

Adding the metrics is on my TODO list.

sshleifer · July 17, 2020, 3:09pm

Yes. It is up to whoever uploaded the model to post their metrics. Please use rouge scores for summarization. Ideally use the nlp package (nlp.metrics('rouge')) or the calculate_rouge_score function so that we can compare apples to apples, and make sure that beam search params are in your config!
Metrics that matter the most:
Rouge2 (fscore mid measure)
RougeL (fscore mid measure)
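
For readers unfamiliar with these two scores, here is a simplified, dependency-free sketch of what Rouge2 and RougeL F-scores measure for a single prediction/reference pair. It is not the implementation behind the nlp package or calculate_rouge_score: it tokenizes on whitespace, does no stemming, and does not aggregate over a dataset, so it has no low/mid/high values. The function names are hypothetical. Use the official tooling for any numbers you report.

```python
from collections import Counter


def _tokens(text):
    # Simplification: lowercase and split on whitespace (no stemming).
    return text.lower().split()


def _f_score(overlap, n_pred, n_ref):
    # Harmonic mean of precision and recall; 0.0 when there is no overlap.
    if overlap == 0 or n_pred == 0 or n_ref == 0:
        return 0.0
    precision = overlap / n_pred
    recall = overlap / n_ref
    return 2 * precision * recall / (precision + recall)


def rouge_n_f(prediction, reference, n=2):
    """F-score over clipped n-gram overlap (n=2 gives Rouge2)."""
    def ngrams(toks):
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    pred = ngrams(_tokens(prediction))
    ref = ngrams(_tokens(reference))
    overlap = sum((pred & ref).values())
    return _f_score(overlap, sum(pred.values()), sum(ref.values()))


def rouge_l_f(prediction, reference):
    """F-score based on the longest common subsequence of tokens."""
    p, r = _tokens(prediction), _tokens(reference)
    # Dynamic-programming table of LCS lengths.
    table = [[0] * (len(r) + 1) for _ in range(len(p) + 1)]
    for i, p_tok in enumerate(p, 1):
        for j, r_tok in enumerate(r, 1):
            if p_tok == r_tok:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return _f_score(table[len(p)][len(r)], len(p), len(r))


prediction = "the cat sat on the mat"
reference = "the cat lay on the mat"
rouge2 = rouge_n_f(prediction, reference, n=2)
rougeL = rouge_l_f(prediction, reference)
```

Here the two sentences share 3 of 5 bigrams and a common subsequence of 5 out of 6 tokens, which is why RougeL comes out higher than Rouge2 for this pair.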

echatzikyriakidis · November 26, 2020, 12:33pm

Dear all!

Which of the following models can be used for summarizing English text out-of-the-box, without the need to fine-tune any of them?

Thank you!

Here are the models:

google/pegasus-xsum
google/pegasus-cnn_dailymail
google/pegasus-large
google/pegasus-multi_news
google/pegasus-arxiv
google/pegasus-wikihow
google/pegasus-gigaword
google/pegasus-pubmed
google/pegasus-newsroom
google/pegasus-reddit_tifu
google/pegasus-billsum
google/pegasus-aeslc
google/roberta2roberta_L-24_bbc
google/roberta2roberta_L-24_gigaword
google/roberta2roberta_L-24_cnn_daily_mail
sshleifer/distill-pegasus-cnn-16-4
sshleifer/distill-pegasus-xsum-16-8
sshleifer/distill-pegasus-xsum-16-4
sshleifer/pegasus-cnn-ft-v2

patrickvonplaten/roberta_shared_bbc_xsum
patrickvonplaten/bert2bert_cnn_daily_mail
mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization
yuvraj/xSumm
yuvraj/summarizer-cnndm

valhalla · December 11, 2020, 10:15am

All of these models are fine-tuned for summarization. You can select a model based on your requirements, such as domain, model size, extractive vs. abstractive, etc.
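
As a rough sketch of out-of-the-box use, any of the checkpoints listed above can be passed to the transformers summarization pipeline. The helper name summarize, the default checkpoint, and the max_length value are arbitrary choices for illustration. The code assumes transformers and a backend such as PyTorch are installed, and the first call downloads the model weights.

```python
def summarize(text, model_name="google/pegasus-xsum", max_length=64):
    """Summarize `text` with a fine-tuned checkpoint from the Hub.

    `model_name` can be swapped for any other summarization
    checkpoint from the list, e.g. one trained on a closer domain.
    """
    # Imported lazily so this module loads even without transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_name)
    result = summarizer(text, max_length=max_length)
    # The pipeline returns a list with one dict per input text.
    return result[0]["summary_text"]
```

Calling summarize(article_text) with a news article and then repeating the call with model_name="google/pegasus-cnn_dailymail" is a quick way to compare how the training dataset shapes the style of the summary.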
