r/WSB Text Summarization

Summarizing Stonks!

Given the emergence of meme stocks, it would be cool to see a summary of the posts in the subreddit. Ideally, the summaries would show what people are identifying in their analysis and why they like or dislike a particular stock.

2. Language

The model will be trained on English text.

3. Model

T5

4. Datasets

This is the tricky part, as I don’t know of an existing dataset for this. The main bottleneck will most likely be scraping Reddit and curating the dataset so that each example has both the full-length post text and a summary.
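
One possible way to get (text, summary) pairs without hand-labelling is to keep only posts where the author wrote their own "TL;DR" and treat that as the summary. This is just a rough sketch of that idea, not something the proposal commits to: the function name `split_tldr` and the two length thresholds are made up for illustration, and it assumes the post text has already been scraped into a plain string.

```python
import re
from typing import Optional, Tuple

# Matches "TL;DR", "TLDR", "tl dr", etc., plus an optional ":", "-" or ",".
_TLDR = re.compile(r"\btl\s*;?\s*dr\b\s*[:\-,]?\s*", re.IGNORECASE)

# Hypothetical thresholds; tune them after looking at real posts.
MIN_BODY_CHARS = 200
MIN_SUMMARY_CHARS = 20


def split_tldr(post_text: str) -> Optional[Tuple[str, str]]:
    """Return (body, summary) if the post ends with a usable TL;DR, else None."""
    matches = list(_TLDR.finditer(post_text))
    if not matches:
        return None
    # Use the last marker: everything before it is the body,
    # everything after it is the author's own summary.
    last = matches[-1]
    body = post_text[: last.start()].strip()
    summary = post_text[last.end():].strip()
    # Drop pairs where either side is too short to be useful,
    # which also filters out posts that open with the TL;DR.
    if len(body) < MIN_BODY_CHARS or len(summary) < MIN_SUMMARY_CHARS:
        return None
    return body, summary
```

Posts that put the TL;DR at the top are discarded here because the body comes out empty; handling those would need a different split rule.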

5. Training scripts

We can make use of run_summarization.py from the Hugging Face Transformers examples.
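
As a rough sketch of how the curated pairs could be handed to the script: write them out as JSON lines with one `text` and one `summary` field per row, then point the script at those files. The flags below are the ones I remember from the PyTorch version of run_summarization.py, so check them against the version you actually use. The `t5-small` checkpoint, the file names, the column names, and the helper names `write_splits` and `build_command` are all assumptions for illustration.

```python
import json
import random
from pathlib import Path
from typing import Dict, List, Sequence, Tuple


def write_splits(
    pairs: Sequence[Tuple[str, str]],
    out_dir: Path,
    val_fraction: float = 0.1,
    seed: int = 0,
) -> Dict[str, Path]:
    """Shuffle (text, summary) pairs and write train/validation JSON-lines files."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one example for validation.
    n_val = max(1, int(len(shuffled) * val_fraction))
    splits = {"validation": shuffled[:n_val], "train": shuffled[n_val:]}

    out_dir.mkdir(parents=True, exist_ok=True)
    paths: Dict[str, Path] = {}
    for name, rows in splits.items():
        path = out_dir / f"{name}.json"
        with path.open("w", encoding="utf-8") as f:
            for text, summary in rows:
                f.write(json.dumps({"text": text, "summary": summary}) + "\n")
        paths[name] = path
    return paths


def build_command(paths: Dict[str, Path], output_dir: str) -> List[str]:
    """Assemble (but do not run) a run_summarization.py invocation."""
    return [
        "python", "run_summarization.py",
        "--model_name_or_path", "t5-small",  # assumed checkpoint size
        "--do_train",
        "--do_eval",
        "--train_file", str(paths["train"]),
        "--validation_file", str(paths["validation"]),
        "--text_column", "text",
        "--summary_column", "summary",
        "--source_prefix", "summarize: ",  # T5 expects a task prefix
        "--predict_with_generate",
        "--output_dir", output_dir,
    ]
```

The returned list can be passed to `subprocess.run` or joined into a shell command once the dataset files exist.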

6. (Optional) Challenges

Curating the dataset, since AFAIK it doesn’t exist at the moment.

7. (Optional) Desired project outcome

A demo that can summarize r/wsb posts to help identify the next meme stonk!