How can I ensure my tensor has dimensions (batch_size, 512)?

TL;DR I’m hitting an intermittent error about my tensor dimensions when running predictions on my fine-tuned model.

Full story:
I followed this tutorial to fine-tune DistilBERT on a text classification dataset. I intermittently get the error below when running predictions against my fine-tuned model. Here is the prediction code:

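from datasets import Dataset
from transformers import AutoTokenizer, DataCollatorWithPadding, TFAutoModelForSequenceClassification

# X is a pandas DataFrame with a "text" column
dataset = Dataset.from_pandas(X)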
self.model = TFAutoModelForSequenceClassification.from_pretrained('my_model')

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def new_preprocess_function(examples):
    return tokenizer(examples["text"], truncation=True)

tokenized_dict = dataset.map(new_preprocess_function, batched=True)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer, return_tensors="tf")

tf_predict_set = tokenized_dict.to_tf_dataset(
    columns=["attention_mask", "input_ids", "label"],


And the error is:

Graph execution error:

Shape of tensor args_0 [16,449] is not compatible with expected shape [?,512].
	 [[{{node EnsureShape_1}}]]
	 [[IteratorGetNext]] [Op:__inference_predict_function_52036]

Why does TensorFlow expect 512? Shouldn’t the tensor dimensions be (batch_size, max_len_of_sentence_in_the_batch)? I know BERT cannot accept inputs longer than 512 tokens, but this batch is shorter than that (449 tokens).

How can I ensure my tensor dimensions are (batch_size, 512)?
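
To make the two behaviours I'm comparing concrete, here is a minimal plain-Python sketch of padding to the longest sequence in the batch versus padding to a fixed length. It is only an illustration: `pad_batch` and `shape` are hypothetical helpers I wrote for this post, not functions from transformers or datasets, and the toy batch simply mimics the [16,449] shape from the error.

```python
# Hypothetical helpers for illustration only; not part of transformers or datasets.
def pad_batch(sequences, pad_id=0, max_length=None):
    """Pad (and truncate) lists of token ids to a common length.

    max_length=None pads to the longest sequence in the batch (dynamic padding);
    an integer pads every sequence to exactly that length (fixed padding).
    """
    target = max_length if max_length is not None else max(len(s) for s in sequences)
    padded = []
    for seq in sequences:
        seq = seq[:target]  # truncate anything longer than the target
        padded.append(seq + [pad_id] * (target - len(seq)))
    return padded


def shape(batch):
    """Return (batch_size, sequence_length) for a rectangular batch."""
    return (len(batch), len(batch[0]))


# A toy batch of 16 sequences whose longest member has 449 tokens, as in the error.
toy_batch = [[1] * n for n in range(434, 450)]

dynamic = pad_batch(toy_batch)                # padded to the longest sequence
fixed = pad_batch(toy_batch, max_length=512)  # padded to a fixed length

print(shape(dynamic))  # (16, 449)
print(shape(fixed))    # (16, 512)
```

In my actual pipeline, the fixed-length variant would presumably correspond to calling the tokenizer with `padding="max_length"` and `max_length=512` in addition to `truncation=True`, but I have not confirmed that this is the right fix for the error above.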