Text wrangling before classification

DanDanDan · May 7, 2021, 5:45pm

Hi!
I’m performing sentiment classification on some sentences by fine-tuning RoBERTa on a subset of my data.
What text wrangling should I perform beforehand?
Should I lowercase, stem, or remove punctuation?
With a simpler model (logistic regression, decision tree, etc.) I would definitely do all of these things, but I’m not sure what to do when using a transformer.
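
For concreteness, this is roughly the kind of wrangling I mean. It is only a minimal sketch using the standard library: `crude_stem` and `wrangle` are made-up names, and the suffix stripper is a deliberately crude stand-in for a real stemmer, not an actual stemming algorithm.

```python
import string

# Hypothetical, deliberately crude suffix list; a real stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "ly")


def crude_stem(token: str) -> str:
    """Strip the first matching suffix, but only if at least 3 characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def wrangle(sentence: str) -> list[str]:
    """Lowercase, remove ASCII punctuation, split on whitespace, then stem each token."""
    lowered = sentence.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [crude_stem(tok) for tok in no_punct.split()]


print(wrangle("I REALLY loved this movie!!!"))
# ['i', 'real', 'lov', 'this', 'movie']
```

After this step the capitalization and the exclamation marks are gone, which is the kind of change I’m unsure about making before passing text to RoBERTa.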
