hey @sgugger
I wanted to know what’s up with Wav2Vec2 facebook/wav2vec2-base-960h and facebook/wav2vec2-base.
I was looking into the model files and saw that a padding value is specified in the preprocessor config, but when I use the preprocessor it seems to ignore that setting completely, and passing values to the processor call does not help either. Where does the processor get the padding value, if not from the preprocessor config?
My code:
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
from datasets import load_dataset
import soundfile as sf
import torch
# load the processor (feature extractor + tokenizer)
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base")
# x is a 1-D float array of raw 16 kHz audio, loaded earlier (e.g. with sf.read)
input_values3 = processor(x, sampling_rate=16000, return_tensors="pt", padding="longest").input_values  # batch size 1
input_values3
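To show what I mean about the padding value, here is a minimal sketch (built locally with a `Wav2Vec2FeatureExtractor` instead of a downloaded checkpoint, so the config values are my own assumptions, not the ones shipped with the model): the processor delegates audio handling to its feature extractor, and padding a ragged batch shows which `padding_value` gets used.

```python
from transformers import Wav2Vec2FeatureExtractor
import numpy as np

# construct a feature extractor directly, with an explicit padding_value
# (for a real checkpoint this would come from preprocessor_config.json)
fe = Wav2Vec2FeatureExtractor(padding_value=0.0, do_normalize=False)

# a ragged batch: two clips of different lengths
batch = [np.ones(4, dtype=np.float32), np.ones(2, dtype=np.float32)]
out = fe(batch, sampling_rate=16000, padding="longest", return_tensors="np")

print(out.input_values)  # second row is padded with padding_value (0.0)
```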
Also, another question: does each Wav2Vec2 model have its own way of normalizing/processing audio? I passed the same audio to jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn and it gave me different input values. My code:
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
from datasets import load_dataset
import soundfile as sf
import torch
# load the processor (feature extractor + tokenizer)
processor = Wav2Vec2Processor.from_pretrained("jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn")
# x is the same 1-D float array of raw 16 kHz audio as above
input_values = processor(x, sampling_rate=16000, return_tensors="pt", padding="longest").input_values  # batch size 1
input_values
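My guess (unconfirmed, so treat this as a hypothesis) is that the two checkpoints ship different preprocessor configs, e.g. the `do_normalize` flag, so the same audio produces different input values. Toggling `do_normalize` on a locally built feature extractor reproduces that kind of difference without downloading anything:

```python
from transformers import Wav2Vec2FeatureExtractor
import numpy as np

audio = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)

# same audio, two feature extractors differing only in do_normalize
raw = Wav2Vec2FeatureExtractor(do_normalize=False)(
    audio, sampling_rate=16000, return_tensors="np"
).input_values
norm = Wav2Vec2FeatureExtractor(do_normalize=True)(
    audio, sampling_rate=16000, return_tensors="np"
).input_values

print(raw)   # audio passed through unchanged
print(norm)  # zero-mean, unit-variance normalized version
```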