Wav2vec2 Access Feature Layers Performance

codexseraph · February 4, 2023, 11:41am

Hey All,

I am rather new to Wav2Vec 2.0 and have been working with it since December for a speech emotion recognition project. I started with the Wav2Vec 2.0 base model from a torchaudio.pipelines bundle, put a classification head on top of it, froze the model's CNN layers, and fine-tune the rest on my emotion recognition task. I can work with the output of the last layer (12), but I'd rather access the output of layer 10.
I did this with model.extract_features(). How can I use the model without storing every layer's output? I have a batch of 32 waveforms of length 400000. I suspect this is why I can't make it run on a GPU (the feature layers lead to a 12 x 32 x 1400 x 768 tensor being stored). There is the num_layers argument, which returns only the outputs of the layers up to a given number (why only an int instead of a list?). I modified components.py to return only the layer I want (nice: I can now run batches of 32 instead of the previous maximum of 25 on my local CPU). I also tried the transformers library and used model(x).last_hidden_state inside my classification head model. With everything I tried, I still run out of memory, with more than 80 GB needed on the GPU. I guess I am not using the right method for what I want to achieve. Please help me!
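
As a rough sanity check on the numbers above, the sketch below estimates how much memory the stacked 12 x 32 x 1400 x 768 feature tensor would take, and compares it with one layer's attention score matrix. It assumes float32 storage, 12 attention heads (as in the base configuration), and an attention implementation that materializes the full frames x frames score matrix; the helper names are made up for illustration. Under those assumptions the stored layer outputs come to well under 2 GiB, so they may not be the main cause of the 80 GB usage. Activations kept for the backward pass are a more likely suspect.

```python
BYTES_PER_FLOAT32 = 4  # assumption: activations are stored as float32


def tensor_bytes(*shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Return the size in bytes of a dense tensor with the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


# Shape quoted in the post: layers x batch x frames x hidden size.
layers, batch, frames, hidden = 12, 32, 1400, 768
heads = 12  # assumption: head count of the base configuration

stacked_outputs = tensor_bytes(layers, batch, frames, hidden)
single_output = tensor_bytes(batch, frames, hidden)
# Assumption: one full (frames x frames) score matrix per head and batch item.
attention_scores_per_layer = tensor_bytes(batch, heads, frames, frames)

print(f"all 12 layer outputs:       {to_gib(stacked_outputs):.2f} GiB")
print(f"one layer output:           {to_gib(single_output):.2f} GiB")
print(f"attention scores per layer: {to_gib(attention_scores_per_layer):.2f} GiB")
```

If the estimate holds, returning only layer 10 saves little by itself; reducing the batch size or the waveform length affects every term above, and the attention term grows quadratically with the number of frames.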

marccasalssalvador · May 7, 2025, 7:31am

Hi @codexseraph! Can you share the code with us?
