Trainer RuntimeError: The size of tensor a (462) must match the size of tensor b (448) at non-singleton dimension 1

So I checked the Whisper feature extractor and the Whisper tokenizer.
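For context, `feature_extractor` and `tokenizer` in the snippets below are the usual Whisper processing classes from Transformers, loaded roughly like this (the checkpoint name and language are just placeholders for whatever model you are fine-tuning):

    from transformers import WhisperFeatureExtractor, WhisperTokenizer

    # placeholder checkpoint/language: substitute the model you are fine-tuning
    feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-small")
    tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-small",
                                                 language="English",
                                                 task="transcribe")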

I assume the problem here is that there are 7 samples whose audio exceeds 30 s and 10 samples whose label length exceeds the model's maximum of 448 tokens.
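Counting the offending samples can be done with something along these lines (just a sketch, assuming `dataset` is a Datasets `Dataset` with a 16 kHz `audio` column and a `raw_transcription` column):

    # rough check: how many clips are longer than 30 s,
    # and how many transcriptions exceed 448 label tokens?
    too_long_audio = dataset.filter(
        lambda x: len(x["audio"]["array"]) > 30 * 16000
    )
    too_long_labels = dataset.filter(
        lambda x: len(tokenizer(x["raw_transcription"]).input_ids) > 448
    )
    print(len(too_long_audio), len(too_long_labels))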
So I tried truncating both the audio and the labels:

MAX_DURATION_IN_SECONDS = 30.0
max_input_length = int(MAX_DURATION_IN_SECONDS * 16000)  # 480,000 samples at 16 kHz

    # inside the dataset-preparation function:
    # compute log-Mel input features from the input audio array,
    # truncating audio longer than 30 s
    batch["input_features"] = feature_extractor(audio["array"],
                                                sampling_rate=audio["sampling_rate"],
                                                max_length=max_input_length,
                                                truncation=True).input_features[0]

    # encode target text to label ids, truncating to the model's
    # maximum label length of 448 tokens
    batch["labels"] = tokenizer(batch["raw_transcription"],
                                truncation=True,
                                max_length=448).input_ids

I guess the truncation setting in the feature extractor doesn’t really matter (?), since the audio is always padded or truncated to 30 s and the log-Mel output has a fixed shape (80 mel bins × 3000 frames) anyway. Either way, this works for me and lets training proceed. Please correct me if my understanding is wrong!
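A quick way to confirm that fixed shape (just a sketch, feeding synthetic audio of different lengths to the same `feature_extractor`):

    import numpy as np

    # 10 s vs. 40 s of silence at 16 kHz: same output shape either way
    for seconds in (10, 40):
        audio = np.zeros(seconds * 16000, dtype=np.float32)
        feats = feature_extractor(audio, sampling_rate=16000).input_features[0]
        print(seconds, feats.shape)  # (80, 3000) in both cases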
