According to pyannote/speaker-diarization · Hugging Face, the performance of PyAnnote speaker diarization on the Ego4D dataset is very poor (a very high diarization error rate, DER):
- Is there a reason for these poor results?
- How does the Ego4D dataset differ from the other datasets, such that performance on it is so much worse?