I’m trying to use Whisper to transcribe a WAV file.
I’m drowning in the documentation and configuration options.
- I want to use the base or medium model.
- I want to be able to choose whether it runs on CPU or GPU.
- For a test.wav file, I want to get its transcription.
- In addition to the transcription, I want to get the encoder embeddings for the same test.wav.
Can you please help me and write the code for this?
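
A minimal sketch of one way to cover the four points above, assuming the `openai-whisper` package (`pip install openai-whisper`), PyTorch, and `ffmpeg` available on the PATH. The helper names `pick_device` and `transcribe_and_embed` are hypothetical, introduced only for illustration. Note that the embedding step here pads or trims the audio to Whisper's 30-second window, so longer files would need to be chunked.

```python
def pick_device(prefer_gpu: bool, cuda_available: bool) -> str:
    """Return "cuda" only when a GPU is requested and actually available."""
    return "cuda" if prefer_gpu and cuda_available else "cpu"


def transcribe_and_embed(path: str, model_name: str = "base", prefer_gpu: bool = True):
    """Transcribe `path` and return (text, encoder embeddings)."""
    # Heavy dependencies are imported here so pick_device stays usable without them.
    import torch
    import whisper

    device = pick_device(prefer_gpu, torch.cuda.is_available())

    # model_name can be "base" or "medium" (weights are downloaded on first use).
    model = whisper.load_model(model_name, device=device)

    # Full transcription; fp16 is only enabled on GPU to avoid the CPU fallback warning.
    result = model.transcribe(path, fp16=(device == "cuda"))

    # Encoder embeddings: load audio, fit it to 30 s, build the log-mel spectrogram.
    audio = whisper.pad_or_trim(whisper.load_audio(path))
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(device)

    with torch.no_grad():
        # The encoder expects a batch dimension: (1, n_mels, n_frames).
        embeddings = model.encoder(mel.unsqueeze(0))

    # Drop the batch dimension and move to CPU for easier downstream use.
    return result["text"].strip(), embeddings.squeeze(0).cpu()


if __name__ == "__main__":
    text, emb = transcribe_and_embed("test.wav", model_name="base", prefer_gpu=False)
    print(text)
    print(emb.shape)
```

Setting `prefer_gpu=True` switches to CUDA when it is available and otherwise falls back to CPU. The returned embeddings should have one row per encoder time step and one column per hidden unit, so the second dimension depends on the model size chosen.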