Hi, I created a script to generate additional sound files using the Google Text-to-Speech service. The quality of the Google WaveNet sound files is so good that I think they could be used alongside the Common Voice dataset as additional training data. Here is the link to the script if you want to try it.
This is interesting! What languages have you tried it with? I'm particularly curious about lower-resource languages. For example, I once tried Google Translate with Swahili, and the speech synthesis was really robotic-sounding.
I have only tried Google TTS for Indonesian. There are two voice types, Standard and WaveNet. The Standard voice still sounds robotic, as you mentioned, but the WaveNet voice is already very good, so you should try it. See "Supported voices and languages" in the Cloud Text-to-Speech documentation. Unfortunately, I don't see Swahili in the list of supported languages.
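For anyone who wants to compare the two voice types, here is a rough sketch of what a request to the Cloud Text-to-Speech REST endpoint (`text:synthesize`) looks like. This is not the script linked above, only an illustration. The voice name `id-ID-Wavenet-A` and the helper names are my assumptions, so check the supported voices list for the exact names. The sketch only builds the request body and decodes a response. Actually sending it requires a Google Cloud project and credentials, which I have left out.

```python
import base64
import json

# REST endpoint of the Cloud Text-to-Speech API (v1).
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(text, voice_name="id-ID-Wavenet-A", language_code="id-ID"):
    """Build the JSON body for a text:synthesize call.

    voice_name is an assumed example; swap in a Standard voice
    to compare the two voice types.
    """
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        # LINEAR16 asks for uncompressed 16-bit PCM audio.
        "audioConfig": {"audioEncoding": "LINEAR16"},
    }


def decode_audio(response_json):
    """The API returns the audio as base64 text in the 'audioContent' field."""
    return base64.b64decode(response_json["audioContent"])


body = build_synthesize_request("Selamat pagi")
print(json.dumps(body, indent=2))
```

You would POST `body` as JSON to `SYNTHESIZE_URL` with your own credentials, then pass the parsed response to `decode_audio` and write the bytes to a file.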