Build a news summarizer


A common data science task for many businesses is condensing the news about their products or services into short summaries. The goal of this project is to fine-tune a model to automatically summarise news articles, ideally in a domain that is of interest to you!


There are various summarisation models on the Hub that have been fine-tuned on the famous CNN/Dailymail dataset. These provide a good starting point for performing domain adaptation:

There are also other summarization models that are worth investigating:


Using the summarization filter on the Hugging Face Hub gives a good list of datasets to start from.
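Before fine-tuning, it can help to run an existing CNN/DailyMail checkpoint on an article from your own domain to see where it falls short. The snippet below is a minimal sketch of that, not a recommended setup: `facebook/bart-large-cnn` is just one public checkpoint picked for illustration (it may or may not be among the ones linked in this post), the helper names `load_summarizer` and `summarize` are made up for this example, and the length limits are arbitrary assumptions.

```python
def load_summarizer(checkpoint="facebook/bart-large-cnn"):
    """Load a summarization pipeline from the Hub.

    The checkpoint name is an illustrative assumption; swap in any
    summarization model you want to use as a starting point.
    """
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("summarization", model=checkpoint)


def summarize(article, summarizer, max_length=130, min_length=30):
    """Return the summary text for a single article.

    `summarizer` is any callable that behaves like a transformers
    summarization pipeline: it returns a list of dicts with a
    "summary_text" key. The length limits are arbitrary defaults.
    """
    result = summarizer(
        article,
        max_length=max_length,
        min_length=min_length,
        truncation=True,  # cut articles that exceed the model's input limit
    )
    return result[0]["summary_text"]
```

To try it, call `summarize(my_article, load_summarizer())` with a news article of your choice. Comparing the output with what you would expect for your domain gives a rough baseline to improve on with fine-tuning.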


Desired project outcomes

  • Create a Streamlit or Gradio app on :hugs: Spaces that can summarize news articles, either from their text or from a given URL.
  • Don’t forget to push all your models and datasets to the Hub so others can build on them!
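For the app itself, here is a rough sketch of how a Gradio demo could accept either raw text or a URL. Everything in it is an assumption made for illustration: the helper names (`is_url`, `html_to_text`, `fetch_article`, `summarize_input`, `build_app`) are hypothetical, the checkpoint is a placeholder for your own fine-tuned model, and pulling text out of `<p>` tags is a crude stand-in for proper article extraction.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
from urllib.request import Request, urlopen


class ParagraphExtractor(HTMLParser):
    """Collect the text inside <p> tags (a crude article extractor)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = []
        self._in_p = False

    def _flush(self):
        # Join buffered fragments and normalise whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def html_to_text(html):
    """Return the paragraph text of an HTML page, one paragraph per line."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n".join(parser.paragraphs)


def is_url(text):
    """Treat the input as a URL only if it has an http(s) scheme and a host."""
    parsed = urlparse(text.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def fetch_article(url):
    """Download a page and extract its paragraph text."""
    request = Request(url.strip(), headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return html_to_text(html)


def summarize_input(user_input, summarizer):
    """Summarize either pasted article text or the article behind a URL."""
    article = fetch_article(user_input) if is_url(user_input) else user_input
    if not article.strip():
        return "No article text found."
    return summarizer(article, truncation=True)[0]["summary_text"]


def build_app(checkpoint="facebook/bart-large-cnn"):
    """Build the Gradio interface; the checkpoint is a placeholder."""
    # Imported lazily so the helpers above work without these packages.
    import gradio as gr
    from transformers import pipeline

    summarizer = pipeline("summarization", model=checkpoint)
    return gr.Interface(
        fn=lambda text: summarize_input(text, summarizer),
        inputs=gr.Textbox(lines=10, label="Article text or URL"),
        outputs=gr.Textbox(label="Summary"),
        title="News summarizer",
    )
```

In a Space, the `app.py` file would end with something like `build_app().launch()`. Keeping the URL handling separate from the model call makes it easy to replace the paragraph extractor with a more robust one later.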

Additional resources

Here are some existing spaces as inspiration:

Discord channel

To chat and organise with other people interested in this project, head over to our Discord and:

  • Follow the instructions on the #join-course channel

  • Join the #new-summarizer channel

Just make sure you comment here to indicate that you’ll be contributing to this project :slight_smile:


Hi Lewis, I am interested in working on a text summarizer as my course project. How can I join this project? - Ali

Hi @Alifarsi you can check out the Discord channel I created (see the project description) and get started with finding a suitable dataset / model for this task :slight_smile:

Is there any slot for this project?

Hey @aozorahime I suggest checking with the existing team in the Discord channel. If it’s full I can create a new team for you and others :slight_smile:

Interested! Let me know if anyone is creating a team and I will join!

Hey @Saiteja you can check whether the team is full in the Discord channel. If it already has 4 people, I can create a new team for you and others :slight_smile:

Thank you @lewtun !