Create a toxicity detector for political tweets in Spain

:wave: Please read the topic category description to understand what this is all about

Description

The goal of this project is to automatically identify toxic speech posted by politicians on Twitter. It focuses on Spain, an interesting multilingual case: the country has several co-official languages, and they are used interchangeably in politics.

Model(s)

Multilingual models like xlm-roberta-base.
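As a starting point, here is a minimal sketch of loading xlm-roberta-base with a classification head via the transformers library. The binary label scheme (`non-toxic` / `toxic`) and the helper names `label_maps` and `load_model_and_tokenizer` are assumptions for illustration, not something fixed by this proposal. The head is randomly initialised, so the model still needs fine-tuning before it predicts anything useful.

```python
from typing import Dict, List, Tuple

MODEL_NAME = "xlm-roberta-base"  # checkpoint named in this post
LABELS = ["non-toxic", "toxic"]  # assumption: a simple binary label scheme


def label_maps(labels: List[str]) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Build the id2label / label2id dicts stored in the model config."""
    id2label = {i: name for i, name in enumerate(labels)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def load_model_and_tokenizer(model_name: str = MODEL_NAME):
    """Load the checkpoint with a fresh (untrained) classification head."""
    # Imported lazily so label_maps() works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = label_maps(LABELS)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(LABELS),
        id2label=id2label,
        label2id=label2id,
    )
    return model, tokenizer
```

Calling `load_model_and_tokenizer()` downloads the checkpoint from the Hub; the returned model can then be fine-tuned on whatever labelled tweets the project collects.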

Datasets

  • tweet_eval is a related resource, but it is English-only.
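To get a feel for tweet_eval, something like the sketch below could work. It assumes the `offensive` subset is the closest proxy for toxicity (the dataset has no subset called "toxicity"), and it masks user mentions and links in the style commonly used with TweetEval-based models, so that any Spanish tweets collected later can be normalised the same way. The function names are made up for this example.

```python
def normalize_tweet(text: str) -> str:
    """Mask user mentions as '@user' and links as 'http'."""
    tokens = []
    for tok in text.split(" "):
        if tok.startswith("@") and len(tok) > 1:
            tok = "@user"
        elif tok.startswith("http"):
            tok = "http"
        tokens.append(tok)
    return " ".join(tokens)


def load_offensive_subset():
    """Load tweet_eval's 'offensive' subset and normalise its text column."""
    # Imported lazily so normalize_tweet() works without datasets installed.
    from datasets import load_dataset

    dataset = load_dataset("tweet_eval", "offensive")  # English-only
    return dataset.map(lambda ex: {"text": normalize_tweet(ex["text"])})
```

For example, `normalize_tweet("@pepe mira https://t.co/abc ya")` gives `"@user mira http ya"`.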

Challenges

  • Getting high-quality data in Spanish and/or integrating data in other languages.

Desired project outcomes

  • Create a Streamlit or Gradio app on :hugs: Spaces that detects toxicity in tweets.
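For the Gradio route, the app could look roughly like this sketch. It assumes a fine-tuned text-classification checkpoint already exists on the Hub; `model_id` is a placeholder the caller has to supply, and `scores_to_dict`, `make_predict` and `build_demo` are hypothetical helper names.

```python
from typing import Callable, Dict, List


def scores_to_dict(results: List[dict]) -> Dict[str, float]:
    """Turn [{'label': ..., 'score': ...}, ...] into {label: score}."""
    return {r["label"]: float(r["score"]) for r in results}


def make_predict(classifier: Callable) -> Callable[[str], Dict[str, float]]:
    """Wrap a classifier so it takes one tweet and returns {label: score}."""

    def predict(tweet: str) -> Dict[str, float]:
        # Passing a list returns one list of per-label scores per input.
        return scores_to_dict(classifier([tweet], top_k=None)[0])

    return predict


def build_demo(model_id: str):
    """Build a Gradio interface around a fine-tuned classifier."""
    # Imported lazily so the helpers above work without these libraries.
    import gradio as gr
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return gr.Interface(
        fn=make_predict(classifier),
        inputs=gr.Textbox(lines=3, label="Tweet"),
        outputs=gr.Label(num_top_classes=2),
        title="Toxicity detector for political tweets",
    )
```

In a Space, `app.py` would then end with `build_demo(model_id).launch()`, with `model_id` pointing at whichever checkpoint the project ends up publishing.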

Discord channel

To chat and organise with other people interested in this project, head over to our Discord and:

  • Follow the instructions on the #join-course channel

  • Join the #toxic-tweets-es channel

Just make sure you comment here to indicate that you’ll be contributing to this project :slight_smile:

I’d love to work on this project!

Hi! I’m also collaborating on this project