Should I pre-process my data for a transformer model used for classification (sentiment analysis), and if so, how?

I understand that textual data often needs to be pre-processed before a model is run on it. However, I’m not sure which pre-processing steps are appropriate for a transformer (e.g., should I remove punctuation, lowercase everything, or lemmatize?).
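
To make the question concrete, here is a minimal sketch of the kind of steps I mean, using only the Python standard library. The function names `lowercase` and `strip_punctuation` and the sample tweet are made up for illustration, and lemmatization is left out because it would need an external library. I am not claiming these steps are appropriate for a transformer; that is exactly what I am unsure about.

```python
import unicodedata


def lowercase(text: str) -> str:
    """Candidate step 1: fold everything to lowercase."""
    return text.lower()


def strip_punctuation(text: str) -> str:
    """Candidate step 2: drop characters in any Unicode punctuation category (P*)."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


# Hypothetical example tweet, not taken from my dataset.
tweet = "I LOVE this!!! Best. Day. Ever."
cleaned = strip_punctuation(lowercase(tweet))
print(cleaned)  # i love this best day ever
```

Note how the capitalization and the repeated exclamation marks disappear, which is part of why I wonder whether such steps might discard information that matters for sentiment.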

Info about my particular case, in case it’s relevant:
I have a dataset of tweets in multiple languages, and I want to use a multilingual model such as XLM-R to classify the tweets by sentiment.