r/WSB Text Summarization

zpn · June 24, 2021, 1:08am

Summarizing Stonks!

Given the emergence of meme stocks, it would be cool to have a model that summarizes the posts in the r/wallstreetbets subreddit. Ideally, the summaries would surface what people are identifying in their analysis and why they like or dislike a particular stock.

2. Language

The model will be trained on English text.

3. Model

T5

4. Datasets

This is the tricky part, as I don’t know of an existing dataset for this. The main bottleneck will most likely be scraping Reddit and curating the dataset so that each example has both the full-length post text and a summary.
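One possible way to get (text, summary) pairs without hand-labeling is to keep only posts where the author wrote their own "TL;DR" and treat that as the reference summary. The sketch below assumes the post bodies have already been scraped into plain strings; the function names, length thresholds, and the `text`/`summary` field names are illustrative choices, not part of any existing dataset or library.

```python
import json
import re

# Matches "TL;DR", "TLDR", "tl:dr", etc., plus any punctuation/space right after it.
TLDR_PATTERN = re.compile(r"\btl\s*[;:]?\s*dr\b[\s:,\-]*", re.IGNORECASE)


def split_post(body, min_text_chars=200, min_summary_chars=20):
    """Split a post body into text and summary at the last TL;DR marker.

    Returns None when there is no marker or when either part is too short
    (for example, when the TL;DR sits at the very top of the post).
    The thresholds are arbitrary assumptions and should be tuned.
    """
    matches = list(TLDR_PATTERN.finditer(body))
    if not matches:
        return None
    last = matches[-1]
    text = body[: last.start()].strip()
    summary = body[last.end():].strip()
    if len(text) < min_text_chars or len(summary) < min_summary_chars:
        return None
    return {"text": text, "summary": summary}


def build_pairs(bodies):
    """Keep only the posts that yield a usable (text, summary) pair."""
    pairs = (split_post(body) for body in bodies)
    return [pair for pair in pairs if pair is not None]


def write_jsonl(pairs, path):
    """Write one JSON object per line, a format the `datasets` JSON loader can read."""
    with open(path, "w", encoding="utf-8") as f:
        for pair in pairs:
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```

This heuristic drops every post without a TL;DR, so the resulting dataset would be biased toward longer "due diligence" style posts, which may actually be what we want here.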

5. Training scripts

We can make use of run_summarization.py
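As a rough sketch of how the script could be pointed at a custom dataset: the helper below only assembles the command line. It assumes the PyTorch version of `run_summarization.py` from the transformers examples, JSON lines files with `text` and `summary` fields (assumed column names), and `t5-small` as a placeholder checkpoint. The helper name and file paths are hypothetical, and the flags should be checked against the version of the script actually used.

```python
import shlex
import sys


def build_train_command(train_file, validation_file, output_dir, model_name="t5-small"):
    """Assemble the argument list for run_summarization.py (does not run it)."""
    return [
        sys.executable, "run_summarization.py",
        "--model_name_or_path", model_name,
        "--do_train",
        "--do_eval",
        "--train_file", train_file,
        "--validation_file", validation_file,
        "--text_column", "text",          # assumed field holding the full post
        "--summary_column", "summary",    # assumed field holding the summary
        "--source_prefix", "summarize: ",  # T5 checkpoints expect a task prefix
        "--output_dir", output_dir,
        "--predict_with_generate",
    ]


if __name__ == "__main__":
    cmd = build_train_command("wsb_train.json", "wsb_val.json", "wsb-t5")
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(cmd))
```

To launch training from Python instead of printing, the same list could be passed to `subprocess.run(cmd, check=True)`.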

6. (Optional) Challenges

The main challenge is curating the dataset since, AFAIK, one doesn’t exist at the moment.

7. (Optional) Desired project outcome

A demo that can summarize r/wsb posts to help identify the next meme stonk!
