Fine-Tune Whisper Tensor size mismatch

Hello! I’m trying to follow this blog post in order to fine-tune Whisper on my dataset. While training, I’m getting this error

While preparing my data I filtered the label lengths, as @sanchit-gandhi suggested, to be less than the max_length of the model (448), but I’m still getting the same error :face_with_diagonal_mouth:
Here is the link to my Colab notebook
What can I do?

Hey @RetaSy! Sorry for the delay in getting back to you! Unfortunately I can’t access your notebook (need permissions!). Feel free to update the sharing permissions and ping me here, I can then take a more detailed look!

In the meantime, could you double check that the extra filter step is implemented before you instantiate the Trainer:

max_label_length = model.config.max_length

def filter_labels(labels):
    """Filter label sequences longer than max length"""
    return len(labels) < max_label_length

vectorized_datasets = vectorized_datasets.filter(filter_labels, input_columns=["labels"])

trainer = Seq2SeqTrainer(train_dataset=vectorized_datasets["train"], ...)
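
If it helps to confirm the filter really ran, here is a rough sketch of a sanity check you could do just before creating the Trainer. It is plain Python and only an illustration: `find_too_long` and `toy_labels` are hypothetical names, and the toy list stands in for the real `labels` column. It uses the same strict `<` comparison as `filter_labels`, so a sequence of exactly 448 tokens counts as too long.

```python
def find_too_long(label_sequences, max_label_length):
    """Return the indices of label sequences that filter_labels would drop."""
    return [
        i
        for i, labels in enumerate(label_sequences)
        if len(labels) >= max_label_length  # mirrors `len(labels) < max_label_length`
    ]


# Hypothetical token-id sequences standing in for the real labels column
toy_labels = [[1] * 10, [1] * 448, [1] * 500, [1] * 447]

too_long = find_too_long(toy_labels, max_label_length=448)
print(too_long)  # [1, 2]
```

An empty list after filtering suggests the step was applied; anything else means some over-length examples are still reaching the model.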



Hi @sanchit-gandhi, I got the same problem when trying to fine-tune camembert. I used the filter as you suggested and it works. However, the filter removed too much of my dataset, and the model’s accuracy is now really bad. Is there another way to deal with it? (As far as I can see, my error comes from: /transformers/models/camembert/, line 871, in forward). Thanks in advance for your help.

Hey @maitrang!

Welcome to the forum and thanks for opening up your first question post :hugs: Awesome to have you here!

What you can do is first increase the value of the generation max length to some arbitrarily large value (e.g. 1024):

model.config.max_length = 1024

And then perform the filtering stage. By increasing the max length, we’ll raise the filter threshold for our dataset and thus filter less of it. This will give us more data to train on. However, it will also increase the memory requirement for training as we have potentially longer sequences in our training data.
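
To get a feel for that trade-off before committing to a value, something like the sketch below could be used. It is only an illustration in plain Python: `kept_fraction` and `toy_lengths` are made-up names and numbers, not anything from the blog or the library, and it does not check whether the model itself can handle sequences of the larger length.

```python
def kept_fraction(label_lengths, max_label_length):
    """Return (fraction of examples kept, longest kept length) for a threshold."""
    if not label_lengths:
        return 0.0, 0
    kept = [n for n in label_lengths if n < max_label_length]
    longest = max(kept) if kept else 0
    return len(kept) / len(label_lengths), longest


# Hypothetical label lengths (in tokens) for six training examples
toy_lengths = [120, 300, 450, 600, 900, 1500]

for threshold in (448, 1024):
    fraction, longest = kept_fraction(toy_lengths, threshold)
    print(f"threshold={threshold}: kept {fraction:.0%}, longest kept = {longest} tokens")
```

With these toy numbers, raising the threshold from 448 to 1024 keeps five of the six examples instead of two, but the longest retained sequence grows from 300 to 900 tokens, which is where the extra memory goes.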

Hope that answers your question!
