Can I change the Text-to-Speech Inference API output format?

Hi there,
I am trying out some TTS APIs, and they all return FLAC audio. Is there a way to specify another output format, say MP3 or WAV? The reason is that I would like to use TTS in Unity, but Unity doesn’t support the FLAC audio format.