How to get data from hf dataset to readable format for whisper-timestamped

xbilek25 · March 11, 2024, 2:14pm

I am trying to use whisper-timestamped (GitHub - linto-ai/whisper-timestamped: Multilingual Automatic Speech Recognition with word-level timestamps and confidence) to get word-level timestamps for data from the Common Voice dataset, but I am struggling to get the audio into a format that the load_audio function accepts. When I pass the audio array, I get the following error: TypeError: expected str, bytes or os.PathLike object, not ndarray.
I tried converting the array to bytes, but then I got: ValueError: embedded null byte. I feel there must be a better way, but I can't figure it out. Can anyone suggest a hint or a solution, please?
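
Both errors are consistent with load_audio expecting a path to an audio file rather than the samples themselves: an ndarray is rejected outright, and raw bytes are treated as a file name, which cannot contain a null byte. One possible workaround, sketched below, is to write the decoded samples to a temporary WAV file and pass that path to load_audio. This is a minimal sketch, not a confirmed recipe. It assumes each dataset row decodes to a dict with "array" and "sampling_rate" keys, that the audio is mono, and that load_audio resamples the file to the rate Whisper expects. The helper names write_wav and transcribe_sample are hypothetical.

```python
import array
import os
import sys
import tempfile
import wave


def write_wav(samples, sampling_rate, path):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    `samples` can be any iterable of floats, including a 1-D NumPy array.
    """
    # Clip each sample to [-1.0, 1.0] and scale it to a signed 16-bit integer.
    pcm = array.array(
        "h", (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples)
    )
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV stores PCM data as little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # assumption: mono audio
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(int(sampling_rate))
        wav_file.writeframes(pcm.tobytes())
    return path


def transcribe_sample(sample, model):
    """Hypothetical helper: transcribe one dataset row with whisper-timestamped.

    Assumes sample["audio"] is a dict with "array" and "sampling_rate" keys.
    """
    import whisper_timestamped as whisper  # imported lazily; needs the package

    audio = sample["audio"]
    with tempfile.TemporaryDirectory() as tmp_dir:
        wav_path = write_wav(
            audio["array"], audio["sampling_rate"], os.path.join(tmp_dir, "clip.wav")
        )
        # load_audio reads the file from disk, so call it before the
        # temporary directory is removed.
        waveform = whisper.load_audio(wav_path)
    return whisper.transcribe(model, waveform)
```

With a model loaded through whisper.load_model, calling transcribe_sample(dataset[0], model) would then run on the first row. If the row's "path" entry points to a file that exists locally, passing that path straight to load_audio may be simpler, but that is not guaranteed, for example when the dataset is streamed.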
