Fine-tuning distilBART

993 · October 17, 2020, 10:46am

Hi there,

I am not a native English speaker, so please don't blame me for the question. I am currently trying to figure out how I can fine-tune distilBART on some financial data (similar to finBERT). The examples/seq2seq README states:

For the CNN/DailyMail dataset, (relatively longer, more extractive summaries), we found a simple technique that works: you just copy alternating layers from bart-large-cnn and finetune more on the same data.
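To make the quoted sentence concrete, here is a minimal sketch of what "copying alternating layers" could look like at the level of layer indices. The helper `pick_layers_to_copy` is hypothetical and only illustrates the idea with evenly spaced indices; it is not the function the README's script uses, and the actual choice of layers there may differ.

```python
def pick_layers_to_copy(n_teacher: int, n_student: int) -> list[int]:
    """Return evenly spaced teacher layer indices for a smaller student.

    Hypothetical helper for illustration only: it always keeps the first
    and last teacher layer and spreads the rest evenly in between.
    """
    if not 1 <= n_student <= n_teacher:
        raise ValueError("n_student must be between 1 and n_teacher")
    if n_student == 1:
        return [0]
    step = (n_teacher - 1) / (n_student - 1)
    return [round(i * step) for i in range(n_student)]


# A 12-layer teacher decoder shrunk to a 3-layer student decoder,
# as the "12-3" suffix in distilbart-cnn-12-3 suggests.
student_decoder_layers = pick_layers_to_copy(12, 3)
print(student_decoder_layers)
```

The student's layers would then be initialized from the teacher layers at these indices before any further fine-tuning.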

As far as I understand this sentence, I can only fine-tune a distilBART student on the same data the teacher was trained on (CNN/DM). Or can I use my own dataset, one that is completely different from the data the BART teacher was trained on?

Thanks in advance
Chris

sshleifer · October 17, 2020, 2:16pm

You can finetune distilbart on any data you want; the question is how well different approaches will perform.

Without knowing much more about the data and assuming you want to be able to train in <24h, I would probably start from sshleifer/distilbart-cnn-12-3.
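For anyone following along, a minimal sketch of what starting from that checkpoint might look like with a plain PyTorch loop. Everything beyond the checkpoint name is an assumption: the `(document, summary)` pair format, the hyperparameters, the output directory name, and the helper names `batches` and `fine_tune` are made up for illustration, and this is not the examples/seq2seq training script.

```python
MODEL_NAME = "sshleifer/distilbart-cnn-12-3"  # checkpoint suggested above


def batches(pairs, size):
    """Yield consecutive slices of `pairs` with at most `size` items each."""
    for start in range(0, len(pairs), size):
        yield pairs[start:start + size]


def fine_tune(pairs, output_dir="distilbart-finance", epochs=1,
              batch_size=2, lr=3e-5):
    """Fine-tune on a list of (document, summary) string pairs.

    Hyperparameters here are placeholder assumptions, not tuned values.
    """
    # Imported here so the helper above works without these packages.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    for _ in range(epochs):
        for batch in batches(pairs, batch_size):
            docs = [doc for doc, _ in batch]
            sums = [summary for _, summary in batch]
            inputs = tokenizer(docs, max_length=1024, truncation=True,
                               padding=True, return_tensors="pt")
            labels = tokenizer(text_target=sums, max_length=128,
                               truncation=True, padding=True,
                               return_tensors="pt").input_ids
            # Padding positions are set to -100 so the loss ignores them.
            labels[labels == tokenizer.pad_token_id] = -100
            loss = model(input_ids=inputs.input_ids.to(device),
                         attention_mask=inputs.attention_mask.to(device),
                         labels=labels.to(device)).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
```

Calling `fine_tune([("long financial report text ...", "short summary ...")])` would download the checkpoint and run one pass over the pairs. In practice you would also want a validation set and a learning-rate schedule.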

993 · October 20, 2020, 6:34pm

Thanks for the reply.
I hope I will get some decent results once I manage to understand the whole fine-tuning process.
