I have seen that Hugging Face has a lot of pre-trained models for different languages. How can I use these models in my own application with recorded voices? I want to build an app where you press a button to record your voice and the app gives back the text version of it. All the implementations I have seen so far use a Hugging Face dataset, and I don’t know how to feed in my own voice. Thank you
I don’t think the Hugging Face models will work on recorded voices. As far as I know, they all expect text input. See the docs: Quick tour — transformers 4.4.2 documentation
Incorrect. There are audio transcription models in there as well.
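To show the idea, here is a minimal sketch of feeding your own recording to one of those transcription models via the `transformers` ASR pipeline. The checkpoint name (`facebook/wav2vec2-base-960h`), the file name `my_voice.wav`, and the assumption of a 16 kHz mono 16-bit WAV are all illustrative choices, not something stated in this thread:

```python
# Sketch: transcribing your own recording with a pretrained Hugging Face model.
# Assumed: a mono, 16-bit PCM WAV file (wav2vec2-base-960h expects 16 kHz audio).
import struct
import wave

def load_wav(path):
    """Read a mono 16-bit PCM WAV file into floats in [-1, 1] plus its sample rate."""
    with wave.open(path, "rb") as f:
        assert f.getnchannels() == 1, "expected mono audio"
        assert f.getsampwidth() == 2, "expected 16-bit PCM"
        frames = f.readframes(f.getnframes())
        samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
        return [s / 32768.0 for s in samples], f.getframerate()

if __name__ == "__main__":
    # Requires: pip install transformers torch
    from transformers import pipeline
    import numpy as np

    audio, rate = load_wav("my_voice.wav")  # your own recorded voice
    asr = pipeline("automatic-speech-recognition",
                   model="facebook/wav2vec2-base-960h")
    # The ASR pipeline accepts a raw float array together with its sampling rate.
    result = asr({"raw": np.asarray(audio, dtype=np.float32),
                  "sampling_rate": rate})
    print(result["text"])
```

In an app, the "record" button would just save the microphone capture to a WAV (or pass the raw samples directly) and hand it to the pipeline the same way.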
See an example here. There is a “Record in browser” button where you can use your own samples.