Hello,
I am using the data collator function from the tutorial "Fine-Tune Wav2Vec2 for English ASR in Hugging Face with 🤗 Transformers". I would like to pad the input tensors to a fixed maximum length in the data collator using the Wav2Vec2 processor.
My first attempt was to set padding to True and pass max_length, as follows:
batch = self.processor.pad(
    input_features,
    padding=True,
    max_length=1000,
    return_tensors="pt",
)
but this does not give me the desired vectors of length 1000; instead, it pads them to the length of the longest sample in the batch. So I changed the parameters as follows:
batch = self.processor.pad(
    input_features,
    padding='max_length',
    max_length=1000,
    return_tensors="pt",
)
This results in the following error:
ValueError: Unable to create tensor, you should probably activate padding with 'padding=True' to have batched tensors with the same length.
How should I set the padding parameter so that all the input features are padded to max_length?
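One possible explanation, sketched under assumptions: with padding='max_length', only samples shorter than max_length are padded, and samples longer than max_length are left untouched unless truncation is also enabled. If any input in the batch has more than 1000 values, the batch stays ragged and cannot be stacked into a tensor, which would match the error above. The toy function below is a hypothetical stand-in written in plain Python (pad_batch and its arguments are illustrative names, not the transformers implementation); it only mimics that padding logic so the three cases can be compared.

```python
def pad_batch(seqs, padding, max_length=None, truncation=False, pad_value=0.0):
    """Toy stand-in for the padding step. NOT the transformers code."""
    # Optional truncation happens before padding.
    if truncation and max_length is not None:
        seqs = [s[:max_length] for s in seqs]

    if padding is True:
        # "Longest" strategy: max_length is ignored, pad to the longest sample.
        target = max(len(s) for s in seqs)
    elif padding == "max_length":
        target = max_length
    else:
        raise ValueError("unsupported padding strategy in this sketch")

    # Only samples shorter than the target are padded; a negative repeat
    # count yields an empty list, so longer samples stay unchanged.
    return [s + [pad_value] * (target - len(s)) for s in seqs]


# Hypothetical batch: one sample shorter and one longer than max_length.
batch = [[0.1] * 800, [0.2] * 1500]

longest = [len(s) for s in pad_batch(batch, padding=True, max_length=1000)]
ragged = [len(s) for s in pad_batch(batch, padding="max_length", max_length=1000)]
fixed = [len(s) for s in pad_batch(batch, padding="max_length", max_length=1000,
                                   truncation=True)]

print(longest)  # padded to the longest sample
print(ragged)   # unequal lengths, cannot be stacked into one tensor
print(fixed)    # every sample has length max_length
```

If this is the cause, passing truncation=True together with padding='max_length' in the self.processor.pad call may be worth trying, assuming the installed version accepts that argument. Truncation discards audio samples, so choosing a max_length at least as large as the longest clip may be the safer option.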