I have fine-tuned Whisper small for Urdu following this Hugging Face post. The original is for Hindi, so I basically just changed “hi” to “ur”, and it worked because a similar amount of Urdu data is available on Mozilla Common Voice.
Now I want to run the model locally using this code chunk (again from the above guide):
from transformers import pipeline
import gradio as gr

pipe = pipeline(model="sanchit-gandhi/whisper-small-hi")  # change to "your-username/the-name-you-picked"

def transcribe(audio):
    text = pipe(audio)["text"]
    return text

iface = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(source="microphone", type="filepath"),
    outputs="text",
    title="Whisper Small Hindi",
    description="Realtime demo for Hindi speech recognition using a fine-tuned Whisper small model.",
)

iface.launch()
However, I am unable to understand how to specify the path to my local checkpoint-5000 folder, or to another folder where I saved the fine-tuned model using trainer.save_model. There are many posts online on how to load a pre-trained model (like this one). With these methods I always get an error saying that the feature_extractor, the tokenizer, or something else is not present (these files are present neither in checkpoint-5000 nor in whisper-small-ur, where I saved manually using trainer.save_model). Any help will be appreciated.
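To make the problem concrete, here is a minimal sketch of what I think loading from a local folder might look like. It is not from the guide. It assumes that the folder already holds the model weights and config.json, that the feature extractor and tokenizer are stored as preprocessor_config.json and tokenizer_config.json, and that the base checkpoint is openai/whisper-small. The names missing_components and load_local_pipeline are hypothetical helpers made up for illustration.

```python
from pathlib import Path

# Files that transformers normally writes for each component.
# Assumption: these are the names used for a Whisper checkpoint.
EXPECTED_FILES = {
    "model config": "config.json",
    "feature extractor": "preprocessor_config.json",
    "tokenizer": "tokenizer_config.json",
}


def missing_components(folder):
    """Return the components whose file is absent from `folder`."""
    folder = Path(folder)
    return [
        name
        for name, filename in EXPECTED_FILES.items()
        if not (folder / filename).is_file()
    ]


def load_local_pipeline(model_dir, base_model="openai/whisper-small", language="Urdu"):
    """Build an ASR pipeline from a local folder (hypothetical helper).

    If the feature extractor or tokenizer files are missing, they are taken
    from `base_model` (an assumption: fine-tuning did not change them) and
    saved next to the weights before loading.
    """
    # Imported lazily so that missing_components() works without transformers.
    from transformers import WhisperProcessor, pipeline

    missing = missing_components(model_dir)
    if "model config" in missing:
        raise FileNotFoundError(f"No config.json found in {model_dir}")
    if missing:
        processor = WhisperProcessor.from_pretrained(
            base_model, language=language, task="transcribe"
        )
        processor.save_pretrained(model_dir)

    return pipeline("automatic-speech-recognition", model=model_dir)
```

With this, missing_components("whisper-small-ur") should list what the folder lacks, and pipe = load_local_pipeline("whisper-small-ur") would stand in for the pipeline(...) line in the Gradio script, if the assumptions above hold.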