How to create a dataset from a CSV for transcription

Hello, I have a CSV file called mapping.csv.
What columns should this file have to fine-tune Whisper?
I had audio and sentence, but I got:

The following columns in the training set don't have a corresponding argument in WhisperForConditionalGeneration.forward and have been ignored: audio, sentence. If audio, sentence are not expected by WhisperForConditionalGeneration.forward, you can safely ignore this message.


You can try having a column audio with the audio file path, and a column sentence for the transcription.
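
As a minimal sketch of that layout, the snippet below writes a CSV with the two columns using Python's standard csv module. The clip paths and transcriptions are made-up placeholders, and write_mapping_csv is a hypothetical helper name, not part of any library.

```python
import csv


def write_mapping_csv(rows, csv_path="mapping.csv"):
    """Write (audio_path, transcription) pairs to a CSV with 'audio' and 'sentence' columns."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["audio", "sentence"])
        writer.writeheader()
        for audio_path, sentence in rows:
            writer.writerow({"audio": audio_path, "sentence": sentence})
    return csv_path


# Placeholder rows: replace with your own file paths and transcriptions.
example_rows = [
    ("clips/sample_0001.wav", "hello world"),
    ("clips/sample_0002.wav", "this is, a test"),
]
```

Calling write_mapping_csv(example_rows) would produce a mapping.csv whose header row is audio,sentence. The csv module quotes transcriptions that contain commas, so they survive a round trip.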

Don’t forget to cast the audio column from string to audio type:

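from datasets import Audio, load_dataset

sampling_rate = 16000  # Whisper models expect 16 kHz audio
ds = load_dataset("csv", data_files="mapping.csv")  # columns: audio, sentence
ds = ds.cast_column("audio", Audio(sampling_rate=sampling_rate))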
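
The warning in the question suggests the Trainer was handed the raw audio and sentence columns, while WhisperForConditionalGeneration.forward takes input_features and labels. A rough sketch of the conversion step follows, assuming the audio column has already been cast so that each example exposes a decoded "array" and "sampling_rate". prepare_example is a hypothetical helper, and the feature extractor and tokenizer are passed in as arguments (for instance, the two parts of a WhisperProcessor). The checkpoint name in the commented usage is only an example.

```python
def prepare_example(example, feature_extractor, tokenizer):
    """Turn one row with 'audio' and 'sentence' into Whisper model inputs."""
    # Assumption: after cast_column, example["audio"] gives "array" and "sampling_rate".
    audio = example["audio"]
    features = feature_extractor(audio["array"], sampling_rate=audio["sampling_rate"])
    return {
        # Log-mel features for the encoder (first and only item of the batch)
        "input_features": features.input_features[0],
        # Token ids of the transcription, used as training targets
        "labels": tokenizer(example["sentence"]).input_ids,
    }


# Assumed usage with transformers (not run here):
# from transformers import WhisperProcessor
# processor = WhisperProcessor.from_pretrained("openai/whisper-small")
# ds = ds.map(
#     lambda ex: prepare_example(ex, processor.feature_extractor, processor.tokenizer),
#     remove_columns=["audio", "sentence"],
# )
```

Dropping the original columns in the map call leaves only input_features and labels, so the Trainer has nothing left to ignore.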