Using the Inference API with the espnet/kan-bayashi_ljspeech_vits model

Hi, how can I use the Inference API with this model: espnet/kan-bayashi_ljspeech_vits?
It takes text as input and should return an audio file, since it is a text-to-speech model.
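
To make the question concrete, here is a rough sketch of the kind of call I have in mind. It assumes the model is served through the usual hosted endpoint (`https://api-inference.huggingface.co/models/<model id>`), that the request body is JSON with an `inputs` field, and that the response body is the raw audio. The helper names, the placeholder token, and the `.flac` extension are my own guesses, not something confirmed for this model.

```python
import json
import urllib.request

# Assumed endpoint pattern for the hosted Inference API; not verified for this model.
API_URL = "https://api-inference.huggingface.co/models/espnet/kan-bayashi_ljspeech_vits"


def build_tts_request(text, token):
    """Build (but do not send) a POST request carrying the text to synthesize."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",  # personal access token (placeholder)
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=payload, headers=headers, method="POST")


def synthesize(text, token, out_path="speech.flac"):
    """Send the request and write the returned bytes to out_path.

    Assumes the response body is the audio itself; the actual container
    format may differ from the .flac extension used here.
    """
    with urllib.request.urlopen(build_tts_request(text, token)) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as f:
        f.write(audio_bytes)
    return out_path


# Hypothetical usage (requires network access and a valid token):
# synthesize("Hello, this is a test.", token="hf_xxx")
```

Is this the right way to call it, or does a text-to-speech model need a different payload or response handling?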