I have been fine-tuning a BERT model for sentence classification. During training I tokenized with padding="max_length", truncation=True, max_length=150, but at inference the model still predicts even when padding="max_length" is not passed.

Surprisingly, the predictions are identical in both cases, yet inference is much faster when padding="max_length" is omitted.

So I need some clarity on the "padding" parameter in the BERT tokenizer. How is BERT able to predict without padding, given that the sentence lengths differ, and are there any negative consequences of not passing padding="max_length" at inference time? Any help would be highly appreciated.
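For reference, here is a minimal plain-Python sketch of what I understand padding to do (the token ids and pad_id=0 are made up for illustration, not real BERT vocabulary ids):

```python
def pad_and_mask(token_ids, max_length, pad_id=0):
    """Pad a token-id sequence to max_length and build its attention mask."""
    ids = token_ids[:max_length]          # truncation=True: clip long inputs
    mask = [1] * len(ids)                 # 1 = real token, attended to
    pad_len = max_length - len(ids)
    ids = ids + [pad_id] * pad_len        # padding="max_length": fill with pad ids
    mask = mask + [0] * pad_len           # 0 = padding, ignored by attention
    return ids, mask

ids, mask = pad_and_mask([101, 7592, 2088, 102], max_length=8)
print(ids)   # [101, 7592, 2088, 102, 0, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 0, 0, 0, 0]
```

My understanding is that because the attention mask zeroes out the pad positions, the model's output for the real tokens should not change, which would explain why I see identical predictions, but I would appreciate confirmation of this.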