pyannote pipeline() not working

I am using pyannote for speaker diarization on top of Whisper. However, even though I have provided the token credential, the code consistently gets stuck at the pipeline() call, and the Colab session eventually disconnects. The video in question is relatively short, only 11 minutes 46 seconds. I have attached the code for your reference.

from pyannote.audio import Pipeline

wav_file = "download1.wav"

# Load the pretrained diarization pipeline
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization",
    use_auth_token="hf_xxxxxxxxxxxxxxxxxxx",
)

# Apply the pipeline to the audio file
diarization = pipeline(wav_file)  # this is where the code gets stuck

# Save the diarization output to a file
with open("diarization.txt", "w") as f:
    for turn, _, speaker in diarization.itertracks(yield_label=True):
        f.write(f"{turn.start:.2f} - {turn.end:.2f}: Speaker {speaker}\n")

print("Speaker diarization completed.")


I wonder if the pyannote settings are wrong…

I am still stuck at this point. Even though I have been granted access to the model and have accepted the user conditions, the code does not run and keeps getting stuck at pipeline("download1.wav"). I need urgent help.


Thanks for the reply, @John6666. But the issue remains the same, even though I have followed all the steps mentioned in the shared post.


I thought it was possible that your GPU wasn’t being used. How about this?

# https://github.com/pyannote/pyannote-audio
from pyannote.audio import Pipeline
import torch

hf_token = "hf_xxxxxxxxxxxxxxxxxxx"
# Pipeline.to() expects a torch.device instance rather than a plain string
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
wav_file = "download1.wav"

# Load the pretrained diarization pipeline and send it to the GPU (when available)
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization", use_auth_token=hf_token)
pipeline.to(device)
# Note: as far as I can tell, Pipeline.from_pretrained does not accept device_map="auto"; use .to(device) as above

# Apply the pipeline to the audio file
diarization = pipeline(wav_file)  # this is where the original code got stuck

# Save the diarization output to a file
with open("diarization.txt", "w") as f:
    for turn, _, speaker in diarization.itertracks(yield_label=True):
        f.write(f"{turn.start:.2f} - {turn.end:.2f}: Speaker {speaker}\n")

print("Speaker diarization completed.")
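
If the pipeline ends up running on the CPU, an 11-minute file can take long enough to look like a hang. One way to tell "slow" from "stuck" is to run the same code on a short excerpt first. Here is a minimal sketch using only the standard-library `wave` module, assuming `download1.wav` is an uncompressed PCM WAV file. The helper names `wav_duration_seconds` and `write_excerpt` and the output name `excerpt.wav` are made up for illustration and are not part of pyannote:

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def write_excerpt(src_path, dst_path, seconds=60.0):
    """Copy the first `seconds` of src_path into dst_path.

    Returns the duration (in seconds) of the excerpt that was written.
    """
    with wave.open(src_path, "rb") as src:
        frame_rate = src.getframerate()
        # Never ask for more frames than the source actually contains
        n_frames = min(src.getnframes(), int(seconds * frame_rate))
        frames = src.readframes(n_frames)
        with wave.open(dst_path, "wb") as dst:
            # Keep the channel count, sample width and sample rate of the source
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(frame_rate)
            dst.writeframes(frames)
        return n_frames / frame_rate
```

For example, `write_excerpt("download1.wav", "excerpt.wav", seconds=60)` followed by `pipeline("excerpt.wav")` should show fairly quickly whether diarization finishes at all. If the one-minute excerpt completes, the full file is more likely slow or running out of memory than truly stuck.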

The pyannote output never starts at 0. Is there any reason for that?


It looks like your pipeline() call is hanging. Here are a few things to check:

  1. Check Token: Make sure your Hugging Face token is valid and has the right permissions.
  2. Audio File: Double-check that the audio file path (download1.wav) is correct and accessible. Try with a smaller file to see if that helps.
  3. Colab Resources: Colab has limited resources. You can check the GPU memory usage with:
    !nvidia-smi
    
    If memory usage is close to the limit, the Colab session might crash and disconnect.
  4. Debugging: Add print statements to check where the code is stopping:
    print("Loading model...")
    pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization", use_auth_token="hf_xxxxxxxxxxxxxxxxxxx")
    print("Model loaded, applying pipeline...")
    diarization = pipeline(wav_file)
    print("Diarization completed.")
    
  5. Run Locally: If it still doesn’t work, try running the code locally or on a different platform (AWS, GCP).
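
For the debugging step in the list above, print statements show which stage was reached but not what the stuck call is doing. A complementary approach is to have the standard-library `faulthandler` module dump the Python stack to a log file whenever a step runs longer than a timeout. This is only a sketch: the `watched_step` helper, the `hang_trace.log` file name, and the 300-second default are assumptions chosen for illustration.

```python
import faulthandler
import time
from contextlib import contextmanager


@contextmanager
def watched_step(name, timeout_s=300.0, log_path="hang_trace.log"):
    """Time a step; while it runs, dump all thread stacks to log_path every timeout_s seconds."""
    # faulthandler needs a real file descriptor, so write to a file rather than notebook stderr
    log_file = open(log_path, "a")
    print(f"[start] {name}", flush=True)
    started = time.perf_counter()
    # Schedule repeated traceback dumps until cancelled
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
    try:
        yield
    finally:
        # Stop the watchdog before closing the file it writes to
        faulthandler.cancel_dump_traceback_later()
        log_file.close()
        elapsed = time.perf_counter() - started
        print(f"[done] {name}: {elapsed:.1f} s", flush=True)
```

Usage would look like `with watched_step("diarization"): diarization = pipeline(wav_file)`. If the call is still running after the timeout, `hang_trace.log` should contain the stack of each thread, which can hint at whether the time is going into model inference, audio loading, or something else.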

If these steps don’t help, you might want to try a different diarization library like SpeechBrain or pyAudioAnalysis.

Good luck!
