Embedding for my own voice

I worked through an example in which I create a speaker embedding for my own voice. I recorded some WAV files and it kind of worked. However, when I use that embedding, the output sounds completely robotic. Am I missing something? Should the recordings contain a particular kind of speech?