Build a title recommender for scientific articles

:wave: Please read the topic category description to understand what this is all about


If you’ve ever worked in research, then you know that picking a catchy title for your articles is not easy! In this project, we’ll train a summarization model that can convert abstracts into titles.


Several of the Summarization models on the Hub should serve as a good starting point for this project.


There are several scientific datasets available on the Hub:


Training a model that can summarise across any scientific domain is unlikely to be feasible. We recommend picking one subdomain (e.g. high-energy physics) and focusing your efforts there.
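As a rough illustration of narrowing the data to one subdomain, here is a minimal sketch. It assumes each record is a dict with `title`, `abstract`, and a space-separated `categories` string; those field names and the toy records are assumptions for illustration, not the layout of any specific Hub dataset.

```python
def filter_by_category(records, category):
    """Keep records whose space-separated `categories` field lists `category`."""
    # Splitting on whitespace matches whole category tokens, not substrings.
    return [r for r in records if category in r.get("categories", "").split()]


# Toy records (made up for illustration only).
papers = [
    {"title": "Toy paper A", "abstract": "Toy abstract A", "categories": "hep-ph hep-ex"},
    {"title": "Toy paper B", "abstract": "Toy abstract B", "categories": "cs.AI"},
    {"title": "Toy paper C", "abstract": "Toy abstract C", "categories": "hep-ph"},
]

hep_papers = filter_by_category(papers, "hep-ph")
for paper in hep_papers:
    print(paper["title"])
```

If the data is loaded as a :hugs: Datasets object, the same membership check could be passed as a predicate to `Dataset.filter` instead of looping over a list.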

Desired project outcomes

  • Create a Streamlit or Gradio app on :hugs: Spaces that can generate titles from a given abstract.
  • Don’t forget to push all your models and datasets to the Hub so others can build on them!
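A Gradio app for the first outcome could look roughly like the sketch below. It assumes a Transformers version that provides the `summarization` pipeline and a fine-tuned checkpoint already pushed to the Hub; the helper names `make_title_fn` and `build_demo` and the model id in the final comment are hypothetical.

```python
def make_title_fn(summarizer):
    """Wrap a summarizer callable so it maps one abstract to one title string."""

    def generate_title(abstract):
        abstract = abstract.strip()
        if not abstract:
            return ""
        # A summarization pipeline returns a list of dicts keyed by "summary_text".
        result = summarizer(abstract, max_length=32, truncation=True)
        return result[0]["summary_text"].strip()

    return generate_title


def build_demo(model_name):
    """Build a Gradio interface around a summarization checkpoint."""
    # Imported lazily so the helper above can be used without these packages.
    import gradio as gr
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_name)
    return gr.Interface(
        fn=make_title_fn(summarizer),
        inputs=gr.Textbox(lines=10, label="Abstract"),
        outputs=gr.Textbox(label="Suggested title"),
        title="Title recommender for scientific articles",
    )


# Hypothetical usage; replace the model id with your own checkpoint:
# demo = build_demo("your-username/your-title-model")
# demo.launch()
```

Keeping `generate_title` separate from the interface makes it easy to try the title logic with a stand-in summarizer before the model is ready. The `max_length=32` cap is an arbitrary choice for short, title-like outputs.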

Additional resources

Discord channel

To chat and organise with other people interested in this project, head over to our Discord and:

  • Follow the instructions on the #join-course channel
  • Join the science-title-generator channel

Just make sure you comment here to indicate that you’ll be contributing to this project :slight_smile:


This looks like a great project and I would love to contribute to this.


Hey @lewtun I think I’ll try to do this as well


Hey @shamikbose89 and @akshay7, cool to hear that you want to tackle this! I’ve created a Discord channel for this project (see topic description) in case you want to use it :slight_smile:


We recommend picking one subdomain (e.g. high-energy physics) and focusing your efforts there.

This is a great starting point. @akshay7, do you have thoughts on which domain you’d like to focus on? I have a feeling that if we pick cs.ML, it will just say “… is all you need”

Hey @shamikbose89, I am open to any of them. I was thinking of cs.AI itself, as it would be easier to manually check the results, but something like might be fun too.
I also generated a list of all the subdomains available in the arXiv dataset:



Yeah, I think it would be good to give cs.AI or stat.ML a shot first. Let’s discuss more on the Discord channel.


Hi folks, I’d be happy to join this project. Is there one more free slot?


Sure thing!

I would like to contribute to this project too.


I will be contributing to this project.

Would love to contribute here.

Lewis, looks like there’s quite a bit of interest in this project. Can you add @FrederikNiesner to the Discord channel, please?

Thanks. Hi @lewtun :point_up_2: can you add me to the channel?

@shamikbose89 @akshay7 do you have a git link you could already share? Around what time are you generally planning to work on this? I should be online around late afternoon Berlin time (noon ET).

I’d like to contribute. What are the next steps?

That time works for me. I’m on the US East Coast. Can you join the Discord channel for this project?

@shainaraza check out the FAQ for next steps :slight_smile:

Hey @FrederikNiesner, if you follow the instructions on the #join-course channel of our Discord you should automatically find the Discord channel for this project (see the topic description)

Thanks @lewtun, I figured it out.


Our project demo is ready and running on Spaces. You can find the link here:
