I am playing around with pretrained BERT models (bert-large-cased-whole-word-masking), using Hugging Face transformers to try it. I first used this piece of code:
from transformers import BertTokenizer, TFBertLMHeadModel
tokenizer = BertTokenizer.from_pretrained("bert-large-cased-whole-word-masking")
m = TFBertLMHeadModel.from_pretrained("bert-large-cased-whole-word-masking")
logits = m(tokenizer("hello world [MASK] like it", return_tensors="tf")["input_ids"]).logits
I then applied softmax to the logits and used argmax to get the most probable token at each position. Everything worked fine up to this point.
When I padded the input with max_length = 100, the model started making wrong predictions: every position was predicted as the same token, ID 119.
Code I used for the argmax:
tf.argmax(tf.keras.activations.softmax(m(tokenizer("hello world [MASK] like it", return_tensors="tf", max_length=100, padding="max_length")["input_ids"]).logits)[0], axis=-1)
Output before padding:
<tf.Tensor: shape=(7,), dtype=int64, numpy=array([ 9800, 19082, 1362, 146, 1176, 1122, 119])>
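As a sanity check, the predicted IDs can be mapped back to tokens with tokenizer.convert_ids_to_tokens. This is a minimal sketch, assuming the same tokenizer and m objects created above:
import tensorflow as tf
# encode the unpadded sentence and predict a token ID for every position
enc = tokenizer("hello world [MASK] like it", return_tensors="tf")
pred_ids = tf.argmax(tf.keras.activations.softmax(m(enc["input_ids"]).logits)[0], axis=-1)
# convert the predicted IDs back to wordpiece tokens for inspection
print(tokenizer.convert_ids_to_tokens(pred_ids.numpy().tolist()))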
Output after padding with max_length = 100:
<tf.Tensor: shape=(100,), dtype=int64, numpy=
array([119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119,
119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119,
119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119,
119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119,
119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119,
119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119,
119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119, 119,
119, 119, 119, 119, 119, 119, 119, 119, 119])>
I wonder whether this problem persists when training a new model, since a fixed input shape is mandatory for training. I have already padded and tokenized the data, and now I want to know whether the same issue will show up there too.
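For reference, padding and tokenizing the training data looks roughly like this (a minimal sketch; the sentences list is only a placeholder, and max_length = 100 is simply the same value I used above):
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained("bert-large-cased-whole-word-masking")
sentences = ["hello world [MASK] like it"]  # placeholder for the real training sentences
# pad/truncate every example to a fixed length of 100 tokens
enc = tokenizer(sentences, return_tensors="tf", max_length=100, padding="max_length", truncation=True)
# enc contains fixed-shape "input_ids", "token_type_ids" and "attention_mask" tensors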