Description
Dhivehi is a low-resource language. Since little data is available, ELECTRA seems like a good option, as it requires less computing power and training data than comparable models.
Model
An electra-small model pretrained on Dhivehi is available here
Discord channel
To chat and organise with other people interested in this project, head over to our Discord and:
Follow the instructions on the #join-course channel
Join the #sentence-transformers-dhivehi channel
Just make sure you comment here to indicate that you’ll be contributing to this project
Hey @ashraq, thanks for proposing this interesting project! One question: what do you mean by creating a “sentence transformer”? Are you talking about adding a pooling layer to the electra-small model and then training that on a Dhivehi corpus?
Do you also happen to have access to / know of a Dhivehi corpus to train on?
Yes: adding a pooling layer and fine-tuning the resulting sentence transformer for Dhivehi.
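For anyone wondering what the pooling layer does here: it collapses the per-token embeddings from the encoder into a single fixed-size sentence vector, typically by mean pooling over the non-padding tokens. A minimal NumPy sketch of mean pooling (toy numbers, not the actual model outputs):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (seq_len, hidden_dim) array from the encoder
    attention_mask:   (seq_len,) array of 1s (real tokens) and 0s (padding)
    """
    mask = attention_mask[:, None].astype(float)    # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)  # sum over real tokens only
    count = mask.sum()                              # number of real tokens
    return summed / count

# Toy example: 4 tokens, 3 hidden dims, last token is padding
emb = np.array([[1.0, 2.0, 3.0],
                [3.0, 2.0, 1.0],
                [2.0, 2.0, 2.0],
                [9.0, 9.0, 9.0]])
mask = np.array([1, 1, 1, 0])
print(mean_pool(emb, mask))  # → [2. 2. 2.]
```

In practice the sentence-transformers library wires this up for you (`models.Transformer` + `models.Pooling` passed to `SentenceTransformer`), so only the fine-tuning data and loss need to be prepared.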
Yes, there is a corpus available. In fact, I had been preparing data for this task since last week, before I came to know about the event, so I think this will be a wonderful opportunity.