Do we need to fine-tune Wav2Vec2FeatureExtractor?

I’m thinking about training a wav2vec2 model for Japanese, and I have a question.
Do we need to train Wav2Vec2FeatureExtractor as well?
Or can we use the same Wav2Vec2FeatureExtractor for any language?

Thanks in advance.

[Found answer] It seems there are two kinds of feature extractor.
The first one just normalizes the raw audio, and the second one is part of the model architecture.
Since the first one only normalizes raw audio, I don’t think we need to train it.
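
To illustrate why the first kind has nothing to train, here is a minimal sketch of per-utterance zero-mean, unit-variance normalization, which as far as I understand is what the preprocessing step does when normalization is enabled. It is written in plain Python rather than calling the library, and the function name `zero_mean_unit_variance` and the `eps` default are my own assumptions for illustration, not the library's API.

```python
import math


def zero_mean_unit_variance(samples, eps=1e-7):
    """Scale one waveform to zero mean and (approximately) unit variance.

    Hypothetical helper for illustration only. It uses nothing but the
    statistics of the input itself, so there are no weights to learn.
    """
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    # eps guards against division by zero on silent (all-equal) input.
    scale = math.sqrt(variance + eps)
    return [(s - mean) / scale for s in samples]


# A toy "waveform"; real input would be raw audio samples.
waveform = [0.0, 0.5, -0.5, 1.0, -1.0]
normalized = zero_mean_unit_variance(waveform)
```

Because the computation depends only on the samples passed in, the same step should apply unchanged to Japanese or any other language. The second kind, the one inside the model, does have learned weights and is a separate matter.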
