Adding a new language (Chadian Arabic ‘shu’) to NLP and LLM models

Hi everyone,

I’ve just started an NLP project on Chadian Arabic, which differs from Standard Arabic in a few pronunciations and is written in the Latin alphabet. For example, the French phrase ‘Comment vas-tu ?’ ("How are you?") is ‘inti keef’ in Chadian Arabic.
I’ve just put together a corpus of 3,000 words (Chadian Arabic with translations), collected mainly from PDFs and some websites. I would appreciate your guidance on the TTS, STT and LLM tools, or other techniques, that I would need to adapt to carry out this project.
My aim in the first phase is to develop an STT (speech-to-text) model that can recognise the local Chadian language and translate it into French and then into English.
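
To make the corpus format concrete, here is a minimal sketch, assuming the pairs are kept as tab-separated text with one Chadian Arabic entry and its French translation per line. The file layout, the `load_pairs` helper and the field names are illustrative assumptions, not a fixed format.

```python
import csv
import io

# Hypothetical layout: one tab-separated pair per line (Chadian Arabic, French).
SAMPLE_TSV = "inti keef\tComment vas-tu ?\n"


def load_pairs(handle):
    """Read (source, target) pairs, skipping blank or incomplete rows."""
    pairs = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # blank line or missing translation
        src = row[0].strip().lower()  # lowercase the Latin-script source side
        tgt = row[1].strip()
        if src and tgt:
            pairs.append({"chadian_arabic": src, "french": tgt})
    return pairs


pairs = load_pairs(io.StringIO(SAMPLE_TSV))
print(pairs)
```

A flat list of pairs like this should be straightforward to convert later into whatever format a translation or speech toolkit expects.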

I look forward to your guidance and thank you for your help.
