My goal is to apply the best model(s) to conduct on-premise ASR with speaker diarization. I am relatively new to ML and AI, and have had some success applying Hugging Face models via `pipeline` in on-premise setups. I am currently using Anaconda, Spyder IDE, and Jupyter as needed. I can’t seem to get a grasp of ASR with diarization. Can someone point me to a good tutorial with examples that can help advance my goal? Thank you in advance.
There is some information in this model card that may be helpful to you.
BTW, we have a model tag for speaker-diarization.
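In case it helps: a common way to get "who said what" is to run ASR and diarization as two separate steps and then line up their outputs by time. Below is a minimal sketch of only that alignment step, in plain Python. The data shapes are assumptions for illustration: ASR chunks as `(start, end, text)` tuples and speaker turns as `(start, end, label)` tuples, and the names `assign_speakers` and `overlap` are hypothetical helpers, not part of any library. You would need to convert whatever your ASR and diarization models actually return into these shapes.

```python
from typing import List, Tuple

# Assumed (hypothetical) data shapes, times in seconds:
#   ASR chunk:    (start, end, text)
#   speaker turn: (start, end, speaker_label)
AsrChunk = Tuple[float, float, str]
SpeakerTurn = Tuple[float, float, str]


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Return the length in seconds that two time intervals share (0.0 if none)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    chunks: List[AsrChunk],
    turns: List[SpeakerTurn],
    unknown: str = "UNKNOWN",
) -> List[Tuple[str, str]]:
    """Label each ASR chunk with the speaker whose turn overlaps it the most."""
    labeled = []
    for start, end, text in chunks:
        best_label, best_overlap = unknown, 0.0
        for t_start, t_end, label in turns:
            shared = overlap(start, end, t_start, t_end)
            if shared > best_overlap:
                best_label, best_overlap = label, shared
        labeled.append((best_label, text))
    return labeled


if __name__ == "__main__":
    # Made-up example values, not real model output.
    chunks = [
        (0.0, 2.5, "Hello, how are you?"),
        (2.6, 5.0, "Fine, thanks."),
        (9.0, 10.0, "Bye."),
    ]
    turns = [
        (0.0, 2.4, "SPEAKER_00"),
        (2.4, 5.2, "SPEAKER_01"),
    ]
    for speaker, text in assign_speakers(chunks, turns):
        print(f"{speaker}: {text}")
```

Chunks that overlap no speaker turn are labeled `UNKNOWN` rather than guessed. Picking the turn with the largest overlap is one simple choice; it will mislabel a chunk in which two people talk, so shorter ASR chunks (for example word-level timestamps, if your model provides them) tend to align more cleanly.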