I'm trying to use the pyannote.audio speaker diarization pipeline,
and I'm getting the wrong number of speakers.
Every example I tried gave wrong results.
I used this YouTube file:
I converted it to a WAV file with a sample rate of 16000 Hz.
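For anyone trying to reproduce the conversion step, here is a minimal sketch of one way to do it, assuming the `ffmpeg` binary is installed and on PATH. The helper names `build_ffmpeg_command` and `convert_to_wav` are hypothetical, and the downmix to mono is an assumption rather than something stated above.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Return an ffmpeg argument list that converts `src` to a WAV file."""
    return [
        "ffmpeg",
        "-y",                      # overwrite dst if it already exists
        "-i", src,                 # input audio/video file
        "-ac", "1",                # downmix to mono (assumption, not from the post)
        "-ar", str(sample_rate),   # target sample rate in Hz
        dst,
    ]


def convert_to_wav(src, dst, sample_rate=16000):
    """Run ffmpeg; raises subprocess.CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
```

Calling `convert_to_wav("input.mp4", "example.wav")` would then write the 16000 Hz file used in the code; `input.mp4` is a placeholder name.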
I ran the following code:
from pyannote.audio import Pipeline

TEST_FILE = "example.wav"
MY_TOKEN = "..."

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization", use_auth_token=MY_TOKEN)
diarization = pipeline(TEST_FILE)
And I got the following diarization:
- The ground truth (GT) contains 4 speakers, not 2.
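To check the predicted speaker count programmatically, a small sketch like the following could be used. The helper `count_speakers` is hypothetical; it assumes the input yields `(segment, track, label)` triples, which is the shape produced by `diarization.itertracks(yield_label=True)`.

```python
def count_speakers(tracks):
    """Count distinct speaker labels in an iterable of (segment, track, label)."""
    labels = {label for _segment, _track, label in tracks}
    return len(labels)
```

With the pipeline output above, this would be called as `count_speakers(diarization.itertracks(yield_label=True))` and compared against the 4 speakers in the ground truth.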
How can I tweak pyannote to get better results?