Additional sound files using speech synthesizer

cahya · March 26, 2021, 6:40am

Hi, I created a script to generate additional sound files using the Google Text-to-Speech service. The quality of the Google WaveNet voices is good enough that I think the generated files could be used as additional training data alongside the Common Voice dataset. Here is the link to the script if you want to try it.

cdleong · April 1, 2021, 8:31pm

This is interesting! What languages have you tried it with? I’m particularly curious about lower-resource languages. For example, I once tried Google Translate with Swahili, and the speech synthesis was really robotic-sounding.

cahya · April 1, 2021, 11:07pm

Hi,
I have only tried Google TTS for Indonesian. There are two voice types, Standard and WaveNet. The Standard voice still sounds robotic, as you mentioned, but the WaveNet voice is already very good. You should try it: see "Supported voices and languages | Cloud Text-to-Speech Documentation". Unfortunately, I don’t see Swahili in the list of supported languages.
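
For anyone who wants a rough idea of what generating such clips involves, here is a minimal sketch. It is not the script linked above; it only assumes the public Cloud Text-to-Speech REST endpoint and an API key. The helper names (`build_request`, `synthesize`, `generate_clips`), the `clips` output directory, the file naming, and the choice of the `id-ID-Wavenet-A` voice are illustrative assumptions.

```python
import base64
import json
import os
import urllib.request

# Public REST endpoint of Google Cloud Text-to-Speech (v1).
TTS_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request(text, language_code="id-ID", voice_name="id-ID-Wavenet-A"):
    """Build the JSON body for one synthesis request.

    The default voice is an assumption; swap in a Standard voice
    (e.g. "id-ID-Standard-A") to compare the two voice types.
    """
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }


def synthesize(text, api_key, **voice_kwargs):
    """Send one sentence to the API and return the decoded audio bytes."""
    body = json.dumps(build_request(text, **voice_kwargs)).encode("utf-8")
    req = urllib.request.Request(
        f"{TTS_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # The API returns the audio as a base64-encoded string.
    return base64.b64decode(payload["audioContent"])


def generate_clips(sentences, api_key, out_dir="clips"):
    """Write one MP3 file per sentence (hypothetical naming scheme)."""
    os.makedirs(out_dir, exist_ok=True)
    for i, sentence in enumerate(sentences):
        audio = synthesize(sentence, api_key)
        with open(os.path.join(out_dir, f"tts_{i:05d}.mp3"), "wb") as f:
            f.write(audio)
```

Calling `generate_clips(["Selamat pagi"], api_key="...")` with a valid key would write `clips/tts_00000.mp3`. Requests are billed per character, so it is worth testing on a handful of sentences first.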
