Create a SentenceTransformer in Dhivehi using ELECTRA

Description
Dhivehi is a low-resource language. Because little training data is available, ELECTRA seems to be a good option, as it requires less computing power and training data than other pretrained models.

Model
electra-small pretrained on Dhivehi, available here

Discord channel

To chat and organise with other people interested in this project, head over to our Discord and:

  • Follow the instructions on the #join-course channel

  • Join the #sentence-transformers-dhivehi channel

Just make sure you comment here to indicate that you’ll be contributing to this project :slight_smile:

Hey @ashraq, thanks for proposing this interesting project! One question: what do you mean by creating a “sentence transformer”? Are you talking about adding a pooling layer to the electra-small model and then training that on a Dhivehi corpus?

Do you also happen to have access to, or know of, a Dhivehi corpus to train on?

Yes. The idea is to add a pooling layer and fine-tune a sentence transformer for Dhivehi.
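
For anyone following along, the pooling step mentioned here can be sketched roughly as below. This is only a minimal illustration, assuming mean pooling over token embeddings (the thread does not say which pooling strategy will be used). The function name `mean_pool` and the toy data are hypothetical, and plain Python lists stand in for the tensors a real electra-small model would output.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of real tokens (mask == 1), ignoring padding."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("need exactly one mask entry per token")

    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value

    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy example (made-up numbers): three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
```

The result is one fixed-size vector per sentence, which is what the fine-tuning step would then train on. In practice a library pooling layer would probably be used instead of hand-written code like this.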

Yes, there is a corpus available. In fact, I have been preparing data for this task since last week, before I came to know about the event. So I think this will be a wonderful opportunity.

This sounds like a great project indeed! I’ve created a Discord channel (see topic description) in case you and others want to use it

By the way, are there any limitations on the instance types we can choose on AWS SageMaker during this event?

As far as I know, you can choose a p3 instance if it’s available, or a T4 if not :slight_smile: