ValueError: Expected input batch_size (16) to match target batch_size (64)

I’ve modeled my training script on the fine-tuning with custom datasets documentation (https://huggingface.co/transformers/custom_datasets.html).
I have both a custom dataset and a custom model (I used the run_language_modeling.py script to pretrain roberta-base on our raw texts).

When I run trainer.train() I get the error ValueError: Expected input batch_size (16) to match target batch_size (64) while the model is computing the loss in a training_step.

I don’t know where target batch_size is being set. The input batch_size matches the value I have for per_device_train_batch_size.
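For context, my setup roughly follows the tutorial's pattern. This is a simplified sketch, not my exact script: the texts, labels, output path, and the "roberta-base" checkpoint name are placeholders (in reality I load my own pretrained checkpoint), and the dataset wrapper is the one from the custom-datasets guide:

```python
import torch
from transformers import (
    RobertaForSequenceClassification,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

# Placeholder data -- in reality these come from our own corpus.
train_texts = ["first example sentence", "second example sentence"]
train_labels = [0, 1]

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
train_encodings = tokenizer(train_texts, truncation=True, padding=True)

class CustomDataset(torch.utils.data.Dataset):
    """Wraps the tokenizer output and labels, as in the custom-datasets guide."""
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

train_dataset = CustomDataset(train_encodings, train_labels)

model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
training_args = TrainingArguments(output_dir="./results", per_device_train_batch_size=16)
trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()  # <- fails here with the batch_size mismatch
```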

Does anyone have an idea?


I tried this using roberta-base as the model as well, and got the same error.

Hey Laurb,

Sorry, I know this is an old post, but did you manage to resolve this? I’ve got the same issue when using DistilBERT with a custom dataset, as in their tutorial. :frowning:
ValueError: Expected input batch_size (16) to match target batch_size (2848).

Sorry, I did resolve it, but I have no memory of how. I’m on transformers 4.9.2 now, no longer have the issue, and don’t need to make changes to their run_classification script. (I’m using the PyTorch version.)


Hello Rainiefantasy,

I know this is an old issue, but have you also managed to resolve this problem? Maybe you remember. I have the exact same problem…

Thank you!

Hello,

I am having a similar issue while running trainer.train():

ValueError: Expected input batch_size (664) to match target batch_size (8).

Checkpoint: bert-base-uncased
Dataset: jmamou/augmented-glue-sst2
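
My preprocessing is roughly the following. This is a simplified sketch, not my exact script; I’m assuming the GLUE SST-2 column names (sentence, label), a standard train split, and num_labels=2:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("jmamou/augmented-glue-sst2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # "sentence" is assumed to be the text column, as in GLUE SST-2
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

# num_labels=2 assumed, since SST-2 is a binary task
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
training_args = TrainingArguments(output_dir="./out", per_device_train_batch_size=8)
trainer = Trainer(model=model, args=training_args, train_dataset=tokenized["train"])
trainer.train()  # <- raises the batch_size mismatch here
```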

Can anyone please help?
Thanks,


In my case it was because I was training a multi-label classification model, encoding the labels with sklearn’s MultiLabelBinarizer, but forgot to set the config parameter that tells the model it is a multi-label problem:

model.config.problem_type = "multi_label_classification"
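
For anyone hitting the same thing, here is a minimal sketch of the fix (the checkpoint and label names are just for illustration, not my actual setup):

```python
from sklearn.preprocessing import MultiLabelBinarizer
from transformers import AutoModelForSequenceClassification

# Toy multi-label targets: each example can carry several labels at once.
raw_labels = [["sports", "politics"], ["tech"], ["sports"]]

mlb = MultiLabelBinarizer()
# BCEWithLogitsLoss (used for multi-label) expects float targets,
# so cast the 0/1 multi-hot matrix to float32.
labels = mlb.fit_transform(raw_labels).astype("float32")  # shape: (num_examples, num_classes)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",          # checkpoint name is just for illustration
    num_labels=len(mlb.classes_),
)
# The crucial line: without it, integer multi-hot labels fall through to the
# single-label cross-entropy path, and the flattened (batch * num_classes)
# label tensor is what produces the batch_size mismatch.
model.config.problem_type = "multi_label_classification"
```

The multi-hot float vectors then go into the dataset as one labels entry per example.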

Hi,

I’m also facing the same issue while running trainer.train().

Using bert-base-uncased

Basically, I split the dataset into sliding windows because the tokenizer has a maximum sequence length.

The sliding windows are generated and each window is mapped to its respective label.
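
Roughly, the windowing looks like the sketch below (simplified, not my exact code; I’m assuming the tokenizer’s return_overflowing_tokens here, and the text/label column names are placeholders):

```python
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Toy data; the real texts are long enough to overflow the 512-token limit.
raw = Dataset.from_dict({"text": ["a very long document ..."], "label": [1]})

def chunk_examples(batch):
    # Split long texts into overlapping windows of at most 512 tokens.
    enc = tokenizer(
        batch["text"],                      # column name assumed
        truncation=True,
        max_length=512,
        stride=128,
        return_overflowing_tokens=True,
    )
    # One input example can produce several windows, so every window needs its
    # own copy of the label; if the labels are not duplicated per window, the
    # number of inputs no longer matches the number of labels.
    sample_map = enc.pop("overflow_to_sample_mapping")
    enc["labels"] = [batch["label"][i] for i in sample_map]
    return enc

windowed = raw.map(chunk_examples, batched=True, remove_columns=raw.column_names)
```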

When running trainer.train(), I’m getting the error below:

ValueError: Expected input batch_size (2040) to match target batch_size (6392)

Can anyone please help with the error?

Thanks