Speech-to-text model with TensorFlow?

Hi, I am looking for a TensorFlow model that can convert an audio file to text. Can we do this with TensorFlow and/or Hugging Face? The only models I can find on the Hub are for PyTorch… :weary:

Thanks!

If you are looking for inference with a TF-based speech-to-text model, here is TFWav2Vec2. Or are you looking to fine-tune a TF-based model? You can look here for that.
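
For the inference case, a minimal sketch might look like the following. It assumes a `transformers` release that still ships the TensorFlow classes (`TFWav2Vec2ForCTC`), plus `tensorflow` and `soundfile` installed. The checkpoint `facebook/wav2vec2-base-960h`, the 16 kHz requirement, and the helper names `to_mono_float32` and `transcribe` are my own choices for illustration, not something from the linked pages.

```python
import numpy as np

# Assumption: a Wav2Vec2 CTC checkpoint that provides TensorFlow weights.
MODEL_ID = "facebook/wav2vec2-base-960h"
# Assumption: the checkpoint expects 16 kHz audio.
TARGET_SR = 16_000


def to_mono_float32(samples):
    """Return a 1-D float32 waveform; (n_samples, n_channels) input is averaged to mono."""
    samples = np.asarray(samples, dtype=np.float32)
    if samples.ndim == 2:
        samples = samples.mean(axis=1)
    return samples


def transcribe(path):
    """Transcribe one audio file with greedy CTC decoding (hypothetical helper)."""
    # Heavy dependencies are imported here so they load only when transcription runs.
    import soundfile as sf
    import tensorflow as tf
    from transformers import TFWav2Vec2ForCTC, Wav2Vec2Processor

    samples, sample_rate = sf.read(path)
    if sample_rate != TARGET_SR:
        # No resampling in this sketch; convert the file to 16 kHz beforehand.
        raise ValueError(f"expected {TARGET_SR} Hz audio, got {sample_rate} Hz")
    waveform = to_mono_float32(samples)

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = TFWav2Vec2ForCTC.from_pretrained(MODEL_ID)

    # The processor normalizes the waveform and returns TensorFlow tensors.
    inputs = processor(waveform, sampling_rate=TARGET_SR, return_tensors="tf")
    logits = model(inputs.input_values).logits

    # Pick the most likely token per frame, then let the processor
    # collapse repeats and blanks into text.
    predicted_ids = tf.argmax(logits, axis=-1)
    return processor.batch_decode(predicted_ids)[0]


if __name__ == "__main__":
    import sys

    print(transcribe(sys.argv[1]))
```

Run it with the path to an audio file as the only argument. If your file is not sampled at 16 kHz, resample it first, since the sketch refuses other rates rather than guessing.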
