We are working on a summarization problem: the project takes a company's webpage text as input and produces a one-sentence summary describing the company's business.
We have a parallel corpus of about 2,000 pairs (input webpage text plus expected summary), and we are fine-tuning a summarization model (sshleifer/distilbart-xsum-12-6) on these 2,000 samples. This approach is giving us decent results.
Apart from the 2,000 parallel pairs, we also have a huge set of target summaries only (around 500k), without corresponding webpage texts. How can we use these unpaired target summaries to improve the summarization model?