Hi @sshleifer,
For what task were the following models fine-tuned / trained? Can they be used for text summarization?
sshleifer/student_cnn_12_6
sshleifer/student_cnn_6_6
Thank you.
These student models are created by copying layers from bart-large-cnn into a smaller model. They are checkpoints that have not been fine-tuned yet, so you'll need to fine-tune them for summarization. More details can be found here:
https://github.com/huggingface/transformers/tree/master/examples/seq2seq#distilbart
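To illustrate the layer-copying idea, here is a minimal sketch in plain Python. It assumes the student keeps evenly spaced teacher layers, always including the first and last; the helper name pick_layers_to_copy is hypothetical, and the script linked above may choose layers differently. It also assumes that "12_6" in the model name means 12 encoder layers and 6 decoder layers, so only the decoder is shrunk here.

```python
from typing import List


def pick_layers_to_copy(n_teacher: int, n_student: int) -> List[int]:
    """Return n_student evenly spaced teacher-layer indices.

    Hypothetical helper for illustration only; the real script may use
    a different (for example, hand-picked) selection.
    """
    if not 1 <= n_student <= n_teacher:
        raise ValueError("n_student must be between 1 and n_teacher")
    if n_student == 1:
        return [0]
    # Spread the student layers across the full depth of the teacher.
    step = (n_teacher - 1) / (n_student - 1)
    return [round(i * step) for i in range(n_student)]


# Stand-ins for the 12 decoder layers of the teacher (assumption: plain
# strings instead of real torch modules, to keep the sketch dependency-free).
teacher_decoder = [f"teacher_decoder_layer_{i}" for i in range(12)]

# The student decoder reuses 6 of the teacher layers; their weights would be
# copied as-is and then fine-tuned.
student_decoder = [teacher_decoder[i] for i in pick_layers_to_copy(12, 6)]

print(pick_layers_to_copy(12, 6))  # [0, 2, 4, 7, 9, 11]
```

Because the copied layers no longer sit in the stack they were trained in, the student needs fine-tuning before it produces useful summaries.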
Thank you!
Yes @valhalla is 100% correct.
More info:
Those are the starting points for the distillation experiments that result in distilbart: https://docs.google.com/spreadsheets/d/1EkhDMwVO02m8jCD1cG3RoFPLicpcL1GQHTQjfvDYgIM/edit?usp=sharing
Thank you @sshleifer!
Hi Sam, @sshleifer
I would like to ask whether such metrics are also available for other models,
e.g., airKlizz/, t5- and mrm8488/t5-base-finetuned-summarize-news
Thank you.
cc @mrm8488
Adding the metrics is on my TODO list.
Yes. It is up to whoever uploaded the model to post their metrics. Please use rouge scores for summarization. Ideally use the nlp package (nlp.metrics('rouge')) or the calculate_rouge_score function so that we can compare apples to apples, and make sure that beam search params are in your config!
Metrics that matter the most:
Rouge2 (fscore mid measure)
RougeL (fscore mid measure)
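For readers unfamiliar with these scores, here is a rough sketch of how a Rouge2 F-score is computed for a single reference/candidate pair. It is a simplified illustration, not the implementation behind the nlp package or calculate_rouge_score: it assumes lowercased whitespace tokenization with no stemming, and the function names are made up for this example. Reported numbers should still come from the standard tools so that they are comparable.

```python
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_fscore(reference: str, candidate: str, n: int = 2) -> float:
    """F-score of n-gram overlap between a reference and a candidate summary.

    Simplified: whitespace tokens, lowercased, no stemming.
    """
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as in the reference.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat sat on a mat"
# 3 of the 5 bigrams match on each side, so precision = recall = 0.6.
score = rouge_n_fscore(reference, candidate, n=2)
print(round(score, 3))  # 0.6
```

The standard tools aggregate such per-example scores over the whole test set before reporting them.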
Dear all!
Which of the following models can be used for summarizing English text out-of-the-box, without the need to fine-tune any of them?
Thank you!
Here are the models:
google/pegasus-xsum
google/pegasus-cnn_dailymail
google/pegasus-large
google/pegasus-multi_news
google/pegasus-arxiv
google/pegasus-wikihow
google/pegasus-gigaword
google/pegasus-pubmed
google/pegasus-newsroom
google/pegasus-reddit_tifu
google/pegasus-billsum
google/pegasus-aeslc
google/roberta2roberta_L-24_bbc
google/roberta2roberta_L-24_gigaword
google/roberta2roberta_L-24_cnn_daily_mail
sshleifer/distill-pegasus-cnn-16-4
sshleifer/distill-pegasus-xsum-16-8
sshleifer/distill-pegasus-xsum-16-4
sshleifer/pegasus-cnn-ft-v2
patrickvonplaten/roberta_shared_bbc_xsum
patrickvonplaten/bert2bert_cnn_daily_mail
mrm8488/bert-small2bert-small-finetuned-cnn_daily_mail-summarization
yuvraj/xSumm
yuvraj/summarizer-cnndm
All of these models are fine-tuned for summarization. You can select a model based on your requirements, such as domain, model size, extractive/abstractive, etc.
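As a minimal sketch of trying one of them out of the box, the snippet below wraps the transformers summarization pipeline in a small function. The function name summarize and the default checkpoint are choices made for this example, not a recommendation; it assumes transformers and a backend such as PyTorch are installed, and some checkpoints may need extra dependencies (for example, a tokenizer package).

```python
def summarize(text: str, model_name: str = "google/pegasus-xsum") -> str:
    """Summarize text with a fine-tuned checkpoint from the Hub.

    Hypothetical convenience wrapper: the import is done lazily so that
    defining the function does not require transformers to be installed.
    """
    from transformers import pipeline

    # Downloads the checkpoint on first use and caches it locally.
    summarizer = pipeline("summarization", model=model_name)
    # The pipeline returns a list with one dict per input text.
    return summarizer(text)[0]["summary_text"]
```

Calling summarize(article) with a news-style article should return a short abstractive summary; passing a different model_name from the list above lets you compare checkpoints trained on other domains.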