Which CNN Summarization models to use?

Hi @sshleifer,

For which task were the following models fine-tuned / trained? Can they be used for text summarization?

sshleifer/student_cnn_12_6
sshleifer/student_cnn_6_6

Thank you.

These student models are created by copying layers from bart-large-cnn to reduce their size. They are checkpoints that have not been fine-tuned, so you'll need to fine-tune them for summarization. More details can be found here


Thank you!

Yes, @valhalla is 100% correct.
More info: those are the starting point for the distillation experiments that resulted in distilbart: https://docs.google.com/spreadsheets/d/1EkhDMwVO02m8jCD1cG3RoFPLicpcL1GQHTQjfvDYgIM/edit?usp=sharing

Thank you @sshleifer!

Hi Sam, @sshleifer

I would like to ask whether I can also find such metrics for other models,

e.g., airKlizz/, t5- and mrm8488/t5-base-finetuned-summarize-news

Thank you.

cc @mrm8488


Adding the metrics is on my TODO list.


Yes. It is up to whoever uploaded the model to post their metrics. Please use ROUGE scores for summarization. Ideally, use the nlp package (nlp.metrics('rouge')) or the calculate_rouge_score function so that we can compare apples to apples, and make sure that the beam search params are in your config!
Metrics that matter the most:
Rouge2 (fscore mid measure)
RougeL (fscore mid measure)
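To make the two metrics above concrete, here is a minimal pure-Python sketch of what a ROUGE-2 and a ROUGE-L F-score measure for a single prediction/reference pair. It is an illustration only, not the nlp package or calculate_rouge_score: the function names are hypothetical, it assumes lowercased whitespace tokenization with no stemming, and it does not do the bootstrap aggregation that produces the "mid" value, so its numbers will not match the official scorers exactly.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f_measure(overlap, pred_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when it is undefined."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f(prediction, reference, n=2):
    """Simplified ROUGE-N F-score (assumption: lowercase + whitespace tokens)."""
    pred = _ngrams(prediction.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((pred & ref).values())
    return _f_measure(overlap, sum(pred.values()), sum(ref.values()))


def _lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(prediction, reference):
    """Simplified sentence-level ROUGE-L F-score based on the LCS length."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    return _f_measure(_lcs_length(pred, ref), len(pred), len(ref))


if __name__ == "__main__":
    hyp = "the cat sat on the mat"
    ref = "the cat lay on the mat"
    print(round(rouge_n_f(hyp, ref), 3), round(rouge_l_f(hyp, ref), 3))
```

For numbers you intend to publish on a model card, use the shared scorer mentioned above rather than a hand-rolled version like this one, so that results stay comparable.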


Dear all!

Which of the following models can be used for summarizing English text out of the box, without the need to fine-tune any of them?

Thank you!

Here are the models:

google/pegasus-xsum
google/pegasus-cnn_dailymail
google/pegasus-large
google/pegasus-multi_news
google/pegasus-arxiv
google/pegasus-wikihow
google/pegasus-gigaword
google/pegasus-pubmed
google/pegasus-newsroom
google/pegasus-reddit_tifu
google/pegasus-billsum
google/pegasus-aeslc
google/roberta2roberta_L-24_bbc
google/roberta2roberta_L-24_gigaword
google/roberta2roberta_L-24_cnn_daily_mail
sshleifer/distill-pegasus-cnn-16-4
sshleifer/distill-pegasus-xsum-16-8
sshleifer/distill-pegasus-xsum-16-4
sshleifer/pegasus-cnn-ft-v2

patrickvonplaten/roberta_shared_bbc_xsum
patrickvonplaten/bert2bert_cnn_daily_mail
mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization
yuvraj/xSumm
yuvraj/summarizer-cnndm

All of these models are fine-tuned for summarization. You can select a model based on your requirements, such as domain, model size, extractive vs. abstractive, etc.
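As a rough sketch of how that selection and an out-of-the-box run might look: the snippet below filters a few of the model IDs from the list by a keyword in their name and then runs one through the transformers summarization pipeline. The helper names (candidate_models, summarize) and the short MODELS subset are made up for illustration, matching on the model name is only a crude proxy for the training domain, and calling summarize assumes transformers is installed and will download the checkpoint on first use.

```python
# A handful of IDs taken from the list above; not the full set.
MODELS = [
    "google/pegasus-xsum",
    "google/pegasus-cnn_dailymail",
    "google/pegasus-arxiv",
    "google/pegasus-pubmed",
    "sshleifer/distill-pegasus-cnn-16-4",
    "patrickvonplaten/bert2bert_cnn_daily_mail",
]


def candidate_models(model_ids, keyword):
    """Return the model IDs whose name mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [m for m in model_ids if needle in m.lower()]


def summarize(text, model_id="google/pegasus-xsum", **generate_kwargs):
    """Summarize text with a fine-tuned checkpoint (hypothetical helper)."""
    # Imported lazily so candidate_models works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # The pipeline returns a list with one dict per input text.
    return summarizer(text, **generate_kwargs)[0]["summary_text"]


if __name__ == "__main__":
    # News-style input, so look at checkpoints trained on CNN/DailyMail.
    print(candidate_models(MODELS, "cnn"))
    # Uncomment to actually download a model and generate a summary:
    # print(summarize("Your article text here ...", "google/pegasus-cnn_dailymail"))
```

Generation settings such as beam size can be passed through generate_kwargs; if they are omitted, the pipeline falls back to whatever is stored in the checkpoint's config.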
