Currently, there are only two models available for hate speech detection in the Hugging Face model hub. By pre-training a RoBERTa model on Kannada text, we hope to make one of the oldest Indic languages more accessible.
A randomly initialized RoBERTa model.
Several datasets contain Kannada sentences; they will require preprocessing before use.
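As a rough idea of what the preprocessing could involve, here is a minimal sketch in plain Python. It normalizes Unicode, collapses whitespace, drops lines that are mostly not Kannada script, and removes exact duplicates. The function names and the `min_ratio` / `min_chars` thresholds are assumptions for illustration, not part of any existing pipeline; the only fixed fact used is that the Kannada Unicode block spans U+0C80 to U+0CFF.

```python
import unicodedata

# Kannada Unicode block: U+0C80 .. U+0CFF
KANNADA_START, KANNADA_END = 0x0C80, 0x0CFF


def is_kannada_char(ch):
    """Return True if the character lies in the Kannada Unicode block."""
    return KANNADA_START <= ord(ch) <= KANNADA_END


def kannada_ratio(text):
    """Fraction of non-whitespace characters that are Kannada script."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(is_kannada_char(ch) for ch in chars) / len(chars)


def clean_corpus(lines, min_ratio=0.7, min_chars=5):
    """Normalize, filter, and de-duplicate raw lines.

    min_ratio and min_chars are hypothetical thresholds; tune them
    for the dataset at hand.
    """
    seen = set()
    cleaned = []
    for line in lines:
        text = unicodedata.normalize("NFC", line)
        text = " ".join(text.split())  # collapse runs of whitespace
        if len(text) < min_chars:
            continue
        if kannada_ratio(text) < min_ratio:
            continue
        if text in seen:  # drop exact duplicates
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned
```

Calling `clean_corpus` on a list of raw lines returns the surviving sentences in their original order. Mixed-script lines that fall below `min_ratio` are dropped entirely, so a lower threshold may suit corpora with many English loanwords or numerals.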
A masked language modeling script for Flax is available here; the same script can probably be used.
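For readers unfamiliar with the objective, the sketch below illustrates the usual BERT/RoBERTa-style token masking in plain Python: about 15% of tokens are selected, and of those 80% become the mask token, 10% become a random token, and 10% are left unchanged. This is not the Flax script itself; `mask_tokens`, its parameters, and the use of `-100` as the "ignore" label are assumptions chosen for illustration.

```python
import random


def mask_tokens(token_ids, mask_id, vocab_size, special_ids=(),
                mlm_probability=0.15, rng=None):
    """Return (inputs, labels) for masked language modeling.

    labels holds the original token at selected positions and -100
    elsewhere (a conventional "ignore this position" value).
    """
    rng = rng or random.Random()
    inputs = list(token_ids)
    labels = [-100] * len(inputs)
    for i, tok in enumerate(token_ids):
        # Never mask special tokens; select others with mlm_probability.
        if tok in special_ids or rng.random() >= mlm_probability:
            continue
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id                    # 80%: mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: random token
        # remaining 10%: keep the original token
    return inputs, labels
```

Passing a seeded `random.Random` as `rng` makes the masking reproducible. Because a fresh mask is drawn on every call, the same sentence is masked differently each time it is seen during training.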
The goal is to use this model and fine-tune it for sentiment analysis of Kannada sentences.