ValueError: Expected input batch_size (4096) to match target batch_size (8)

/usr/local/lib/python3.7/dist-packages/transformers/optimization.py:309: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  FutureWarning,
***** Running training *****
  Num examples = 1000
  Num Epochs = 5
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 625
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-3435b262f1ae> in <module>()
----> 1 trainer.train()

7 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   2844     if size_average is not None or reduce is not None:
   2845         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2846     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
   2847 
   2848 

ValueError: Expected input batch_size (4096) to match target batch_size (8).

I'm attempting to fine-tune a model and I get the error above when running trainer.train(). Also, just to test whether I can run this end to end before splitting out a proper train and eval dataset, I'm passing the same small dataset to the Trainer for both.
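
For reference, the eventual split will probably look something like this (just a sketch, assuming small_train_dataset is a Hugging Face datasets.Dataset; the 80/20 ratio is arbitrary):

# Sketch of the eventual train/eval split (not what I'm running yet);
# assumes small_train_dataset is a datasets.Dataset.
splits = small_train_dataset.train_test_split(test_size=0.2, seed=42)
train_dataset = splits["train"]
eval_dataset = splits["test"]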

How do I process the dataset so that the input batch size matches the target batch size?
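
In case the preprocessing matters, the tokenization for this kind of setup would look roughly like the sketch below; the checkpoint name, the "text" column, and max_length=512 are placeholders for illustration, not necessarily what I ran.

from transformers import AutoTokenizer

# Hypothetical tokenization step; the checkpoint, column name, and
# max_length are assumptions for illustration.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512)

small_train_dataset = small_train_dataset.map(tokenize, batched=True)

Here are the training arguments and Trainer setup: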

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=2e-5,
    per_device_train_batch_size=4096,
    per_device_eval_batch_size=4096,
    num_train_epochs=5,
    weight_decay=0.01,
    evaluation_strategy="epoch",
)

from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_train_dataset,  # same small dataset for train and eval, just for this test run
)
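
One thing I noticed: 4096 = 8 × 512, i.e. the logged batch size times a plausible sequence length, so it looks like the loss is comparing per-token logits against one label per sequence. Below is a minimal shape check I could run; it's a sketch that assumes the dataset has uniformly padded input_ids and labels columns and that model is the same object passed to the Trainer.

import torch

# Sketch: push one batch through the model and compare shapes.
# Assumes sequences are already padded to a uniform length.
batch = small_train_dataset[:8]
input_ids = torch.tensor(batch["input_ids"])
labels = torch.tensor(batch["labels"])
with torch.no_grad():
    outputs = model(input_ids=input_ids)
print(outputs.logits.shape)  # if this is (8, 512, num_labels), that would explain input batch_size 4096
print(labels.shape)          # torch.Size([8]) -- one label per sequence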