ASTFeatureExtractor

tirengarfio · June 13, 2025, 8:52am

Hi,

I’m working on a Master’s dissertation on predicting music popularity using the AST model.

I’m currently looking at the ASTFeatureExtractor (documented here: Audio Spectrogram Transformer), which converts raw audio into Mel spectrograms.

It looks like the default value of the ‘max_length’ parameter of ASTFeatureExtractor is 1024. As I understand it, 1024 is the number of spectrogram frames, so assuming one frame per 10 ms, only the first 10.24 seconds of each song would be fed to the model. Can anyone confirm that?
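
For reference, here is the arithmetic behind that guess as a minimal sketch. It assumes the extractor emits one spectrogram frame every 10 ms; that hop size and the helper name `frames_to_seconds` are my own assumptions for illustration, not something I have confirmed. It also ignores the analysis window length, so the result is approximate.

```python
def frames_to_seconds(num_frames: int, frame_shift_ms: float = 10.0) -> float:
    """Approximate audio duration covered by `num_frames` spectrogram frames.

    frame_shift_ms is the assumed hop between consecutive frames (10 ms here).
    The window length is ignored, so this is only an estimate.
    """
    return num_frames * frame_shift_ms / 1000.0


max_length = 1024  # default max_length mentioned above
covered = frames_to_seconds(max_length)
print(f"max_length={max_length} covers roughly {covered:.2f} s of audio")
```

If that assumption holds, anything beyond roughly the first ten seconds of a song would be cut off unless max_length is raised or the track is split into chunks.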

Regards
