I am using pyannote for speaker diarization on top of Whisper. Even though I have provided the access token, the code consistently gets stuck at the pipeline() call, and the Colab session eventually disconnects. The video in question is relatively short, only 11 minutes 46 seconds. I have attached the code for your reference.
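from pyannote.audio import Pipeline

# Load the pipeline with the access token (placeholder shown here)
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization", use_auth_token="hf_xxxxxxxxxxxxxxxxxxx")

wav_file = "download1.wav"
diarization = pipeline(wav_file)  # this is where the code gets stuck

# Save the diarization output to a file
with open("diarization.txt", "w") as f:
    for turn, _, speaker in diarization.itertracks(yield_label=True):
        f.write(f"{turn.start:.2f} - {turn.end:.2f}: Speaker {speaker}\n")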
I am still stuck at this point. Even though I have been granted access to the model and have accepted the user conditions, the code does not finish and keeps getting stuck at pipeline("download1.wav"). I need urgent help.
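One way to tell a slow run from a genuinely hung one is to try the pipeline on a short excerpt first. The sketch below cuts the first 30 seconds out of the file using only Python's standard `wave` module. It assumes `download1.wav` is an uncompressed PCM WAV file; the helper name `make_clip` and the output name `clip30.wav` are made up for illustration.

```python
import wave


def make_clip(src_path, dst_path, seconds=30.0):
    """Copy the first `seconds` of a PCM WAV file to dst_path.

    Returns the duration of the written clip in seconds.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        # Never ask for more frames than the file actually contains.
        n_frames = min(src.getnframes(), int(seconds * frame_rate))
        frames = src.readframes(n_frames)

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(n_channels)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(frames)

    return n_frames / frame_rate


# Hypothetical usage in the notebook:
# make_clip("download1.wav", "clip30.wav", seconds=30)
# diarization = pipeline("clip30.wav")
```

If the excerpt finishes in a reasonable time, the full file is probably just slow (for example because it runs on the CPU) rather than stuck.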
It's possible that your GPU isn't being used. How about this?
# https://github.com/pyannote/pyannote-audio
from pyannote.audio import Pipeline
import torch
hf_token = "hf_xxxxxxxxxxxxxxxxxxx"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")  # recent pyannote.audio versions expect a torch.device here, not a plain string
wav_file="download1.wav"
# Load the pretrained diarization pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization", use_auth_token=hf_token).to(device)  # send the pipeline to the GPU (when available)
# pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization", device_map="auto", use_auth_token=hf_token)  # if your GPU is weak... make sure to run "pip install -U accelerate" before using it
# Apply the pipeline to the audio file
diarization = pipeline(wav_file)  # this is where the code was getting stuck
# Save the diarization output to a file
with open("diarization.txt", "w") as f:
    for turn, _, speaker in diarization.itertracks(yield_label=True):
        f.write(f"{turn.start:.2f} - {turn.end:.2f}: Speaker {speaker}\n")
print("Speaker diarization completed.")
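To confirm whether a GPU is actually visible before starting the long call, a quick check like the following can be run first. This is only a sketch: the helper name `pick_device` is made up, and it falls back to "cpu" when PyTorch is not importable at all.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is no GPU path at all.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
if device == "cpu":
    # In Colab the GPU has to be enabled in the runtime settings.
    print("No GPU visible. In Colab, check Runtime > Change runtime type.")
```

If this prints "cpu", the pipeline will run on the CPU, which is likely to be much slower for a file of this length and may look like a hang.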