AI to Convert Any Voice to a Specific Voice

ryuuft · September 15, 2024, 12:42pm

Hey, everyone! How’s it going?

Lately, I’ve been exploring artificial intelligence used to create music, separate instrumentals from songs, among other things. Because of that, I became interested in training my own AI, where the main idea is for it to receive an audio input (preferably vocals only) and transform the input voice into a fixed target voice.

I work in data science, but my focus has been mostly on natural language models, so I wanted to see if there’s anyone willing to give me some tips about the audio field, haha

What are the most commonly used models in this area?
When training the model, would it be ideal to generate a dataset with the target voice in various scenarios, or could I limit it to just a “spoken” dataset, for example?
Any tips?

Thanks in advance!

John6666 · September 15, 2024, 1:02pm

If it’s text to voice, I think RVC2 is the main one.
There are all sorts of resources on external sites.

I’m not too familiar with voice changer systems, but I think it’s in the collection. What’s the name of the model…?
Well, if you find a space with a similar purpose and look in requirements.txt, you should be able to find it.

But be aware that many of the voice-related libraries are from the last year. Build errors tend to occur quite often.
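The requirements.txt tip above can be sketched in a few lines of Python. This is only a rough sketch, not an official workflow: it assumes the `huggingface_hub` package is installed, that the Space actually ships a `requirements.txt`, and the function names `parse_requirements` and `space_requirements` are made up for illustration.

```python
import re


def parse_requirements(text):
    """Return the package names listed in the body of a requirements.txt."""
    names = []
    for raw in text.splitlines():
        # Drop comments: a '#' at line start or preceded by whitespace.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()
        # Skip blank lines and pip options such as -r or --extra-index-url.
        if not line or line.startswith("-"):
            continue
        # Keep URL-style requirements (git+https://...) untouched.
        if "://" in line:
            names.append(line)
            continue
        # The name ends at the first version specifier, extra, or marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0).lower())
    return names


def space_requirements(space_id):
    """Download requirements.txt from a Space and list its packages.

    Needs network access; `space_id` is whatever "user/space-name" you found.
    """
    from huggingface_hub import hf_hub_download  # imported lazily on purpose

    path = hf_hub_download(
        repo_id=space_id,
        filename="requirements.txt",
        repo_type="space",
    )
    with open(path, encoding="utf-8") as handle:
        return parse_requirements(handle.read())
```

Calling `space_requirements("some-user/some-voice-space")` (a placeholder id, not a real Space) would print the libraries that Space depends on, which is usually enough to figure out which voice model or toolkit it wraps.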

ryuuft · September 15, 2024, 3:00pm

So, I’m not very familiar with voice-changing systems either. What I’m testing are online software tools, like Suno and Moises. From using them, I became interested in practicing some finetuning in this area.

And thank you very much for your response, @John6666

John6666 · September 15, 2024, 3:47pm

This is the most common voice cloning I see at HF.

And also this.

RVC

applio

I think there was another…

BTW, I’ve used Suno too, but I guess I can’t match that guy’s performance…
Maybe they’ll catch up soon, but not yet.

By the way, if existing RVC-related models can be reused, there is a lot of data lying around on the Internet, from animation to singers, like this.

ryuuft · September 15, 2024, 4:17pm

Now with all these places for me to check out, I’ll probably be able to explore a lot o/

I really appreciate it, this will definitely help!
And that guy, Suno, is out of this world. But getting something 30% close to that is already excellent.

John6666 · September 15, 2024, 4:29pm

I’m glad I could be of some help!

EXTRAS for you. nightmare stage.

Ali125 · October 12, 2024, 8:00am

If you want to convert your voice into an AI voice, audiomodify.com is the best website. There you can change your voice into another singer’s voice.

DylanAndrew · November 7, 2024, 8:18am

That sounds like an awesome project! For voice transformation, models like WaveNet or Tacotron work great. For training, a varied dataset helps, but starting with just a spoken dataset can work too. Just make sure to refine it as you go for better results. Good luck!

ryuuft · November 9, 2024, 12:09pm

Thank you, mate! I’ll be sure to take a look.

ryuuft · November 9, 2024, 12:10pm

Thank you, Dylan! It really is a very interesting project, and looking back today, there are a lot of resources to test and see what can make my life easier haha

