Speech to Text concern

My organisation is in trouble: my whole business is based on speech-to-text (STT), and I need more accurate transcription than I am getting now.

I run a testing platform for students who are preparing for various exams. Students answer each question in audio form, and to evaluate an answer I need a fully accurate transcript of that audio so I can analyse their grammar and mistakes. The recordings contain a little background noise.

What I have tried so far:

- SeamlessM4T: it is not able to transcribe any of my audio completely.
- Whisper (in the frontend): it auto-corrects what the speaker said, and it sometimes gets stuck on one word and repeats it again and again.
- Web Speech API: it gets stuck partway through.

The volume is huge, approximately 80,000 hours of audio per month.

Can Hugging Face help with this? A solution that runs in the frontend would be ideal. I just want whatever the user says, right or wrong, to appear exactly the same in the text: no autocorrection, no added filler sentences, and no getting stuck. Any input will be very helpful.