Hello everyone. When I use the map method to modify my data, it raises KeyError: 'array'.
But I'm sure my dataset includes this column.
print(ds_train)
print(ds_train[1]['audio']['array'])
And the output is:
Dataset({
    features: ['audio', 'gender'],
    num_rows: 16960
})
tensor([-8.2690e-14, -7.3000e-13, 1.5195e-13, ..., 8.2001e-07,
        9.9102e-07, -3.9292e-07], device='cuda:0')
My code is:
from transformers import Wav2Vec2FeatureExtractor

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

def preprocess_function(examples):
    # Debug print to see what a batch looks like inside map
    print(examples)
    # This is the line that raises KeyError: 'array'
    audio_arrays = [x["array"] for x in examples["audio"]]
    inputs = feature_extractor(
        audio_arrays, sampling_rate=feature_extractor.sampling_rate, max_length=16000,
        truncation=True, padding=True,
    )
    return inputs

ds_train = ds_train.map(preprocess_function, remove_columns="audio", batched=True)
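To narrow this down, a small check like the one below might show what examples["audio"] actually looks like inside map. This is only a sketch: describe_audio_column is a hypothetical helper (not part of datasets or transformers), and the two toy batches are made-up stand-ins, not real output from my dataset. It assumes the column arrives either as a list of per-example dicts or as a single dict of lists.

```python
def describe_audio_column(audio_column):
    """Summarize the layout of a batched "audio" column (hypothetical helper)."""
    # Case 1: a single dict of lists, e.g. {"array": [...], "sampling_rate": [...]}
    if isinstance(audio_column, dict):
        return {"layout": "dict of lists", "keys": sorted(audio_column.keys())}
    # Case 2: a list of per-example items; collect the keys of any dict items
    keys = set()
    for item in audio_column:
        if isinstance(item, dict):
            keys.update(item.keys())
    return {"layout": "list of items", "keys": sorted(keys)}


# Toy batches standing in for examples["audio"]; these are assumptions,
# not what the real dataset returns.
list_batch = [{"path": "a.wav", "bytes": None}, {"path": "b.wav", "bytes": None}]
dict_batch = {"array": [[0.0, 0.1]], "sampling_rate": [16000]}

print(describe_audio_column(list_batch))
print(describe_audio_column(dict_batch))
```

Calling print(describe_audio_column(examples["audio"])) as the first line of preprocess_function should reveal whether "array" is among the keys that each element really has at that point.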
I'm not sure what happened. Thank you in advance.