Seeking guidance on building a text-to-speech AI with custom voice morphing

gylotip · August 18, 2024, 5:54pm

I am interested in creating an AI tool that converts text into speech using a specific voice that I upload. To clarify, here is what I am aiming to achieve:

Input Specifications:
- Text: for example, "I love sleeping."
- Voice Sample: a recording of someone’s voice saying something like "This is your fault."

Output Specifications:
- The text "I love sleeping." spoken in the voice from the provided sample.

In other words, I want to create a text-to-speech system where the output text is spoken in the voice of the provided audio sample.
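
To make the intended interface concrete, here is a minimal Python sketch of the input/output contract I have in mind. It is only an illustration: `CloneRequest`, `synthesize`, `VoiceBackend`, and `placeholder_backend` are hypothetical names I made up, and the placeholder does no real synthesis. The actual model behind `backend` is exactly the part I am asking about.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

# Hypothetical type: a backend takes (text, reference audio path)
# and returns the synthesized audio as raw bytes.
VoiceBackend = Callable[[str, Path], bytes]


@dataclass(frozen=True)
class CloneRequest:
    text: str           # what should be spoken, e.g. "I love sleeping."
    voice_sample: Path  # recording of the target voice


def synthesize(request: CloneRequest, backend: VoiceBackend) -> bytes:
    """Validate the request, then hand it to whichever backend is plugged in."""
    if not request.text.strip():
        raise ValueError("text must not be empty")
    return backend(request.text, request.voice_sample)


def placeholder_backend(text: str, voice_sample: Path) -> bytes:
    # Stand-in only (assumption): a real backend would run a
    # voice-cloning text-to-speech model here and return audio data.
    return f"{voice_sample.name}:{text}".encode("utf-8")


request = CloneRequest("I love sleeping.", Path("this_is_your_fault.wav"))
audio = synthesize(request, placeholder_backend)
```

Keeping the backend as a plain callable means any library or model could be swapped in later without changing the calling code.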

Questions:

1. Is it feasible to build such an AI from scratch? What kind of technology or frameworks would be suitable for this project?
2. What programming languages or tools would you recommend? I am open to suggestions, but ideally I would like to use something well suited to AI work.
3. Are there any existing resources or libraries that might help in building this type of voice morphing system?

Fair warning: I may not continue with this project if it turns out to be too complex.
