Transformer for Abstractive Summarization of Chats Based on Performance

Hi, I have some general questions about transfer learning on pretrained models for a summarization problem. I’ve been trying to build a Seq2Seq model for summarizing chats between two user agents.
I’ve tried the T5 model (both the pretrained checkpoint and transfer learning), but the results were not satisfactory: the summarized text missed the context entirely after training on the custom dataset.
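For reference, the sketch below is roughly the zero-shot baseline I started from (the t5-base checkpoint, the toy chat, and the generation settings are placeholders, not my exact configuration):

```python
# Rough zero-shot baseline: summarize a short chat with pretrained T5.
# The checkpoint and generation settings are placeholders.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

chat = "Anna: Are we still on for lunch?\nBen: Yes, 12:30 works for me."
inputs = tokenizer("summarize: " + chat, return_tensors="pt", truncation=True)
summary_ids = model.generate(inputs["input_ids"], max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```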
Can someone please help me understand which model works better for summarizing chats, or suggest any pre-processing steps that should precede this?
Thanks in advance.

Hi @anant0308! Happy to discuss possible approaches, but what works best (and whether you can expect good results at all) will depend on what your fine-tuning data looks like. For example:

  1. How long are the chats?
  2. Do you have any gold summaries for your chats?
  3. Do you have examples of summaries without corresponding chats?
  4. How many examples do you have?
  5. How are you representing speaker turns?

Keep in mind that summarizing chats is quite a different task from summarizing news text: if the pre-training data lacks any kind of dialogue input, the model will have to learn how to interpret multi-turn structure from scratch, which will probably be your main challenge.
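As a starting point, here is a minimal sketch of one common way to linearize a chat for T5. The newline-separated "Name: utterance" format and the "summarize:" prefix are just conventions, not something the pretrained model already understands:

```python
# Minimal sketch: linearize a multi-turn chat into a single T5 input.
# The "Name: utterance" turn format is one convention among many.
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

chat = [
    ("Anna", "Are we still on for lunch tomorrow?"),
    ("Ben", "Yes! 12:30 at the usual place?"),
    ("Anna", "Perfect, see you then."),
]

# One turn per line keeps the dialogue structure visible to the model.
source = "summarize: " + "\n".join(f"{name}: {utt}" for name, utt in chat)

inputs = tokenizer(source, max_length=512, truncation=True, return_tensors="pt")
print(tokenizer.decode(inputs["input_ids"][0]))
```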


Hey @yjernite, the primary challenge, as you mentioned, is to identify the speakers and hence interpret the structure. The dataset is somewhat similar to the SAMSum corpus (https://arxiv.org/src/1911.12237v2/anc/corpus.7z); a minimal loading sketch for SAMSum follows the list below.
The following key points might help:

  1. Gold summaries are available for the chats.
  2. The chats are similar to normal text messages exchanged between two users.
  3. There are around 15K-20K training examples.
  4. Currently, speakers are represented as-is, by name.
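For anyone who wants to reproduce a similar setup on the public corpus, this is a minimal sketch of loading SAMSum through the datasets library; the "samsum" Hub id and the "dialogue"/"summary" field names come from the public dataset card, and my own dataset only resembles this format:

```python
# Minimal sketch: load the public SAMSum corpus from the Hugging Face Hub.
# Each example has "dialogue" (the chat, one "Name: utterance" per line)
# and "summary" (the gold abstractive summary).
from datasets import load_dataset

samsum = load_dataset("samsum")
print(samsum["train"][0]["dialogue"])
print(samsum["train"][0]["summary"])
print(len(samsum["train"]))  # ~14.7K training examples
```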

Kindly suggest improvements for a better implementation of abstractive summarization. These are my key queries:

  1. Is there a preferred model for chat summarization?
  2. What pre-processing steps might improve performance?
  3. How should speakers be represented? I found that the context can change when a speaker's name also appears inside a sentence, which increases ambiguity. (One anonymization approach I've been considering is sketched after this list.)
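This is a minimal sketch of that idea, assuming each chat's participants can be mapped to generic markers; the <spk1>/<spk2> token names and the plain string replacement are my own placeholders, not an established scheme:

```python
# Sketch: replace speaker names with generic markers registered as
# special tokens, so the tokenizer never splits them and in-sentence
# name mentions map to the same symbol as the turn prefix.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

speaker_tokens = ["<spk1>", "<spk2>"]  # placeholder marker names
tokenizer.add_special_tokens({"additional_special_tokens": speaker_tokens})
model.resize_token_embeddings(len(tokenizer))  # make room for the new tokens

def anonymize(chat, speakers):
    """Map each speaker name (turn prefixes and in-sentence mentions)
    to its generic marker."""
    mapping = dict(zip(speakers, speaker_tokens))
    lines = []
    for name, utterance in chat:
        for real_name, marker in mapping.items():
            utterance = utterance.replace(real_name, marker)
        lines.append(f"{mapping[name]}: {utterance}")
    return "\n".join(lines)

chat = [("Anna", "Ben, are you coming?"), ("Ben", "On my way, Anna!")]
print(anonymize(chat, ["Anna", "Ben"]))
# <spk1>: <spk2>, are you coming?
# <spk2>: On my way, <spk1>!
```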

Any suggestion would be of great help!

Did you ever find an improvement?

I am trying to accomplish the same thing with the SAMSum dataset.