Getting hidden states from the "automatic-speech-recognition" pipeline

LandryB · July 15, 2022, 5:59pm

Hi, I’ve had success transcribing long audio clips with the pipeline class. However, I’d also like to get the hidden states (the outputs of the last layer) at every time step of these audio files. I’ve tried passing return_hidden_states=True both when initializing the pipeline and when calling it, but this does not affect the output. How else could I retrieve the hidden states for long audio files using the pipeline class?

from transformers import pipeline
import soundfile as sf

filename = 'test.wav'
audio_input, sample_rate = sf.read(filename)

pipe = pipeline(model="facebook/wav2vec2-base-960h", return_hidden_states=True)

out = pipe(audio_input, chunk_length_s=10, stride_length_s=2, return_hidden_states=True, return_timestamps="word")

The returned out contains only text and timestamps:

 'chunks': [{'text': 'AND', 'timestamp': (0.34, 0.4)},
  {'text': 'THEN', 'timestamp': (0.46, 0.58)},
  {'text': 'NOW', 'timestamp': (1.54, 1.7)},
  {'text': 'WERE', 'timestamp': (1.78, 1.92)},
  {'text': 'RECORDING', 'timestamp': (1.96, 2.32)},
  {'text': 'AN', 'timestamp': (2.38, 2.42)},
  {'text': 'IOU', 'timestamp': (2.52, 2.62)},
  {'text': 'ALL', 'timestamp': (3.1, 3.2)},
  {'text': 'RIGHT', 'timestamp': (3.24, 3.38)},
  {'text': 'SOIXSIDE', 'timestamp': (5.18, 5.72)},
  {'text': 'EGG', 'timestamp': (5.78, 5.86)},
  {'text': 'AM', 'timestamp': (6.88, 7.02)},
  {'text': 'I', 'timestamp': (8.98, 9.0)},
  {'text': 'WAS', 'timestamp': (9.06, 9.12)},
  {'text': 'EVERYBODY', 'timestamp': (9.18, 9.48)}]}
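
Since the output above carries only text and timestamps, one possible workaround is to skip the pipeline and run the model on overlapping chunks directly. The following is a minimal sketch, not a confirmed pipeline feature: chunk_spans and last_hidden_states are hypothetical helper names, the windowing only loosely mirrors chunk_length_s / stride_length_s, and it assumes mono audio already at the model's sampling rate (16 kHz for this checkpoint).

```python
def chunk_spans(n_samples, chunk_len, stride):
    """Return (start, end) sample spans of overlapping windows.

    Hypothetical helper: consecutive windows advance by
    chunk_len - 2 * stride samples, so neighbours overlap by 2 * stride.
    """
    if chunk_len <= 2 * stride:
        raise ValueError("chunk_len must be larger than 2 * stride")
    step = chunk_len - 2 * stride
    spans = []
    start = 0
    while True:
        end = min(start + chunk_len, n_samples)
        spans.append((start, end))
        if end >= n_samples:
            break
        start += step
    return spans


def last_hidden_states(audio, sample_rate,
                       model_name="facebook/wav2vec2-base-960h",
                       chunk_length_s=10, stride_length_s=2):
    """Return a list of (start, end, hidden) tuples, one per chunk.

    hidden is the last-layer output with shape (frames, hidden_size).
    Assumes audio is a 1-D float array sampled at the model's rate.
    """
    # Imported lazily so chunk_spans can be used without these packages.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_name)
    model = Wav2Vec2ForCTC.from_pretrained(model_name).eval()

    chunk_len = int(chunk_length_s * sample_rate)
    stride = int(stride_length_s * sample_rate)

    results = []
    for start, end in chunk_spans(len(audio), chunk_len, stride):
        # Note: a very short final chunk may be too small for the
        # convolutional feature encoder and would need padding or skipping.
        inputs = processor(audio[start:end], sampling_rate=sample_rate,
                           return_tensors="pt")
        with torch.no_grad():
            out = model(inputs.input_values, output_hidden_states=True)
        # hidden_states is a tuple over layers; [-1] is the last layer,
        # [0] drops the batch dimension.
        results.append((start, end, out.hidden_states[-1][0]))
    return results
```

With the variables from the snippet above, this would be called as last_hidden_states(audio_input, sample_rate). The sketch does not trim the overlapping frames at chunk boundaries, so stitching the chunks into one continuous sequence would still require dropping the frames that fall inside each stride.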
