How much data is needed to fine-tune a summarization model?

Hi All,

I want to fine-tune a summarization model on a custom dataset. Are there any guidelines on how much data I would need, whether data from a different domain helps, etc.?

I am trying to summarize conversations. In most cases, these conversations involve just two people. I fine-tuned google/flan-t5-base and facebook/bart-large-cnn on about 1,000 examples; the results are good, but not as good as GPT-3.5's.
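For reference, here is roughly the setup I used for the flan-t5-base run, written as a minimal sketch: the dataset file name, the "dialogue"/"summary" field names, and the hyperparameters are placeholders, not my exact values.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# "conversations.json" is a placeholder; each record has a "dialogue"
# string (the conversation) and a reference "summary" string.
raw = load_dataset("json", data_files="conversations.json")["train"]
raw = raw.train_test_split(test_size=0.1)  # hold out 10% for validation

def preprocess(batch):
    # T5-style task prefix; tune max lengths to your conversation lengths
    model_inputs = tokenizer(
        ["summarize: " + d for d in batch["dialogue"]],
        max_length=512,
        truncation=True,
    )
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(preprocess, batched=True, remove_columns=raw["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-conv-summarizer",
    learning_rate=3e-4,              # common starting point for T5 fine-tuning
    per_device_train_batch_size=8,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```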

  1. Do I need to gather and train on more data? If I don’t have access to data for my use case, can I use data from another domain as long as it is conversational? Say, podcast transcripts?

  2. How long should I train the model? Are there any best practices for choosing the number of epochs, etc.? (I have sketched my current approach right after this list.)
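For question 2, what I am currently doing is setting an epoch cap and letting early stopping on the validation loss decide when to stop, rather than fixing the epoch count. A sketch of that, reusing the names from the setup above (values are again placeholders):

```python
from transformers import EarlyStoppingCallback

# Reusing model, tokenized, and the data collator from the sketch above.
args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-conv-summarizer",
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    num_train_epochs=10,             # upper bound only; early stopping usually ends sooner
    evaluation_strategy="epoch",     # evaluate on the held-out split every epoch
                                     # (renamed eval_strategy in newer transformers versions)
    save_strategy="epoch",
    load_best_model_at_end=True,     # required for EarlyStoppingCallback
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```

Is this a reasonable way to pick the number of epochs, or is there a better heuristic?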

I am looking to improve the performance of my model and could really use some help! I have looked online but can’t find a clear answer. I understand that in many cases you need to experiment to find what works for you, but there are so many possibilities, and as a beginner in this field I am looking for a starting point.

Thank you for your help!