In the BertSum paper, the authors say that summarization happens on top of BERT's output: they stack summarization-specific layers (two Transformer layers worked best) and then add a softmax layer to decide which sentences should be included in the summary. However, the code at https://pypi.org/project/bert-extractive-summarizer/ contains classes for clustering — what are these classes/methods used for?
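For context on what clustering could be doing in an extractive summarizer: one common approach (distinct from BertSum's supervised classifier) is to embed each sentence, run k-means over the embeddings, and keep the sentence nearest each cluster centroid as a representative. Below is a minimal, hypothetical sketch of that idea using a toy k-means in plain NumPy — the function name, the tiny 2-D "embeddings", and the deterministic initialization are all my own illustration, not the library's actual API:

```python
# Sketch of clustering-based extractive selection: cluster sentence
# embeddings with k-means, then pick one representative sentence per
# cluster (the one closest to the centroid). Hypothetical code, not
# taken from bert-extractive-summarizer.
import numpy as np

def select_by_clustering(embeddings: np.ndarray, k: int, iters: int = 20) -> list:
    """Return indices of the sentences closest to each of k cluster centroids."""
    # deterministic init: spread initial centroids across the input order
    init_idx = np.linspace(0, len(embeddings) - 1, k).astype(int)
    centroids = embeddings[init_idx].astype(float).copy()
    for _ in range(iters):
        # assign each sentence embedding to its nearest centroid
        dists = np.linalg.norm(embeddings[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned embeddings
        for j in range(k):
            members = embeddings[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    # the "summary" = one representative sentence index per cluster
    dists = np.linalg.norm(embeddings[:, None] - centroids[None], axis=-1)
    return sorted(set(dists.argmin(axis=0).tolist()))

# toy "sentence embeddings": two well-separated groups in 2-D
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]])
print(select_by_clustering(emb, k=2))
```

With the toy input above, the function returns one sentence index from each of the two groups, which is the intuition behind clustering-based selection: cover the distinct "topics" of the document rather than score sentences individually.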