Hugging Face model not transcribing the entire length of the audio file

Brief: I’m unable to transcribe more than a few seconds of a 5-minute audio file using a fine-tuned OpenAI Whisper model from Hugging Face.
I’m having trouble transcribing an audio file in a local Indian language using this Hugging Face model (thennal/whisper-medium-ml · Hugging Face). It only transcribes the first few seconds, but I would like to get the entire file transcribed. I’m trying this on Google Colab.

What I have tried?

First of all, the code, which only transcribes the first few seconds:


I also tried setting the model’s max_new_tokens, which resulted in an even shorter transcription, and I wasn’t able to go above 500.
I tried DEFAULT_INPUT_AUDIO_MAX_DURATION = 300, which resulted in an error.
I tried asking Bing about this, but its answers were not useful.
I even tried deploying it on a Space, but the result is the same everywhere.

What I want?

I would be grateful if someone could share code that transcribes the entire audio file, no matter how long it is.
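
For context, Whisper models process audio in 30-second windows, so a 5-minute file has to be split into several windows for all of it to be transcribed. Below is a minimal, untested sketch of that idea, assuming the transformers automatic-speech-recognition pipeline and its chunk_length_s parameter work with this fine-tuned model. The helper names chunk_bounds and transcribe_long are hypothetical and only for illustration, and 16000 Hz is assumed as the sample rate.

```python
def chunk_bounds(n_samples, sample_rate=16000, chunk_s=30.0):
    """Return (start, end) sample indices that cover the whole audio
    in consecutive windows of at most chunk_s seconds.
    Hypothetical helper, shown only to illustrate the windowing."""
    size = int(chunk_s * sample_rate)
    if size <= 0:
        raise ValueError("chunk_s * sample_rate must be positive")
    return [(start, min(start + size, n_samples))
            for start in range(0, n_samples, size)]


def transcribe_long(audio_path, model_id="thennal/whisper-medium-ml", chunk_s=30):
    """Transcribe a long file by letting the pipeline chunk it.
    Assumption: chunk_length_s is supported for this checkpoint."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=chunk_s,  # split the audio into windows of this length
    )
    # The pipeline returns a dict; the full transcription is under "text".
    return asr(audio_path)["text"]


if __name__ == "__main__":
    # A 5-minute file at 16 kHz spans ten 30-second windows.
    print(len(chunk_bounds(5 * 60 * 16000)))
```

On Colab, calling transcribe_long("audio.mp3") (a placeholder file name) would be the intended usage. Reading a file path this way relies on ffmpeg being installed.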