Sentiment analysis of Sinhala language using deep learning networks

graw · June 30, 2021, 4:29pm

1. Description

The main objective of the project is to test deep learning models for identifying sentiment in Sinhala text. A Facebook dataset is used to train and test the models. The model code is already written; only the training and testing phases remain. Since Sinhala remains a resource-poor language for NLP, this project should help improve the current tools and provide insight into the current state of Sinhala sentiment analysis.
Another objective is to migrate the current code to JAX using Flax, Haiku and other libraries. The plan is also to test libraries such as Trax, which provides basic deep learning models, along with the currently popular Transformers.

2. Language

The models are trained on Sinhala-language text.

3. Model

The models that will be tested are:

RNN
LSTM
GRU
BiLSTM
Baseline models combined with a CNN
Stacked LSTM and BiLSTM
HAHNN
Capsule networks
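
As an illustration of what a JAX port of one of the baselines above (the GRU) could look like, here is a minimal sketch written in plain JAX rather than Flax or Haiku. It is not the project's code: the function names (`init_params`, `gru_step`, `classify`), the vocabulary size, the layer sizes, the number of sentiment classes and the use of id 0 for padding are all assumptions made for the example.

```python
import jax
import jax.numpy as jnp


def init_params(key, vocab_size, embed_dim, hidden_dim, num_classes):
    """Randomly initialise embedding, GRU and output-layer weights."""
    k_emb, k_w, k_u, k_out = jax.random.split(key, 4)
    scale = 0.1
    return {
        "embed": scale * jax.random.normal(k_emb, (vocab_size, embed_dim)),
        # Input and recurrent weights for the update, reset and candidate gates.
        "W": scale * jax.random.normal(k_w, (embed_dim, 3 * hidden_dim)),
        "U": scale * jax.random.normal(k_u, (hidden_dim, 3 * hidden_dim)),
        "b": jnp.zeros((3 * hidden_dim,)),
        "W_out": scale * jax.random.normal(k_out, (hidden_dim, num_classes)),
        "b_out": jnp.zeros((num_classes,)),
    }


def gru_step(params, h, x):
    """One GRU update for a batch: h is (batch, hidden), x is (batch, embed)."""
    xz, xr, xn = jnp.split(x @ params["W"] + params["b"], 3, axis=-1)
    uz, ur, un = jnp.split(params["U"], 3, axis=-1)
    z = jax.nn.sigmoid(xz + h @ uz)        # update gate
    r = jax.nn.sigmoid(xr + h @ ur)        # reset gate
    n = jnp.tanh(xn + (r * h) @ un)        # candidate state
    return (1.0 - z) * n + z * h


def classify(params, token_ids, pad_id=0):
    """Return class logits of shape (batch, num_classes) for padded token ids."""
    embedded = params["embed"][token_ids]        # (batch, time, embed)
    mask = (token_ids != pad_id)[..., None]      # (batch, time, 1)
    h0 = jnp.zeros((token_ids.shape[0], params["U"].shape[0]))

    def scan_fn(h, inputs):
        x_t, m_t = inputs
        h_new = gru_step(params, h, x_t)
        # Keep the previous state at padded positions.
        return jnp.where(m_t, h_new, h), None

    # lax.scan iterates over the leading axis, so move time to the front.
    xs = (jnp.swapaxes(embedded, 0, 1), jnp.swapaxes(mask, 0, 1))
    h_final, _ = jax.lax.scan(scan_fn, h0, xs)
    return h_final @ params["W_out"] + params["b_out"]


# Hypothetical sizes: 50-token vocabulary, 3 sentiment classes.
params = init_params(jax.random.PRNGKey(0), vocab_size=50, embed_dim=8,
                     hidden_dim=16, num_classes=3)
batch = jnp.array([[5, 7, 0, 0], [1, 2, 3, 4]])  # first row is padded with 0
logits = classify(params, batch)
print(logits.shape)  # (2, 3)
```

The recurrence is expressed with `jax.lax.scan` so that it can be wrapped in `jax.jit` and differentiated with `jax.grad`. A Flax or Haiku version would replace the hand-written parameter dictionary with the library's module system.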

4. Dataset

A Facebook dataset containing 526,732 Sinhala and English posts extracted from CrowdTangle. The dataset covers a decade's worth of content from Facebook pages popular in Sri Lanka.

5. Training scripts

The following links contain the model scripts

Main models

6. Challenges

Several models still need to be adjusted and tested.

7. Desired project outcome

Performance measures of each model
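
The post does not say which measures will be reported. As a rough sketch, assuming the usual classification metrics (accuracy plus per-class and macro-averaged precision, recall and F1), a per-model report could be computed as follows. The function name `classification_report` and the label strings are hypothetical and chosen only for this example.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Compute accuracy plus per-class and macro-averaged precision, recall, F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    n = len(per_class)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(c["precision"] for c in per_class.values()) / n,
        "macro_recall": sum(c["recall"] for c in per_class.values()) / n,
        "macro_f1": sum(c["f1"] for c in per_class.values()) / n,
        "per_class": per_class,
    }


# Toy labels, not taken from the dataset.
report = classification_report(
    ["pos", "neg", "neu", "pos"],
    ["pos", "neg", "pos", "neg"],
)
print(report["accuracy"], report["macro_f1"])
```

Macro averaging weights every class equally, which matters if the sentiment classes in the dataset turn out to be imbalanced. Classes with no predictions or no true examples are scored 0.0 here rather than raising an error.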

8. Reads

The following links can be useful to better understand the project and what has been done previously.

https://sencat.lk/
