Shape mismatch between labels and logits

Hello everyone, I hope this is the correct category for this question. I’m using TFAutoModelForSequenceClassification for a multi-label classification task. My dataset has a text column plus 20 columns, one per class; a 1 in a column means the example belongs to that class. For example:

‘It’s cold today’ 0 0 1 1 0 1 0
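To give an idea of the structure, a toy version of the DataFrame would look something like this (only 3 label columns here instead of 20; in my data the label columns are named after the class ids):

import pandas as pd

# Toy DataFrame with the same structure as my data: one text column
# ('Premise') and one binary column per class.
train_df = pd.DataFrame({
    'Premise': ["It's cold today", "It's sunny and warm"],
    '0': [0, 1],
    '1': [0, 0],
    '2': [1, 0],
})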

I loaded the DataFrame into a HF Dataset, and loaded the model and tokenizer with:

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
bert = TFAutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=num_labels, problem_type='multi_label_classification', id2label=id2labels, label2id=labels2id)
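num_labels, id2labels and labels2id are built from the class ids, roughly like this (the class names here are just placeholders; the real names don’t matter for the problem):

num_labels = 20
# Placeholder class names; the integer keys match the label column names in the DataFrame.
id2labels = {i: f'class_{i}' for i in range(num_labels)}
labels2id = {name: i for i, name in id2labels.items()}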

I then proceeded to convert the dataset into:

‘Tokenized text’ | ‘Array of 0s and 1s’

To do this, I wrote the following code:

def tokenize_and_encode(val, tokenizer, max_length):
  # This function is applied to one example at a time by dataset.map (no batched=True)
  tokenized = tokenizer(val['Premise'], truncation=True, padding='max_length', max_length=max_length)
  # Collect the per-class columns into a single list of 0s and 1s, one entry per class
  labels = []
  for index in id2labels.keys():
    labels.append(val[str(index)])
  return {'input_ids': tokenized['input_ids'],
          'attention_mask': tokenized['attention_mask'],
          'labels': labels}

from datasets import Dataset

train_dataset = Dataset.from_pandas(train_df)

train_dataset = train_dataset.map(lambda x: tokenize_and_encode(x, tokenizer, 200), remove_columns=train_dataset.column_names)
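A single mapped example looks fine to me (and the call arguments in the error below confirm that the batched labels do come out as (16, 20)):

example = train_dataset[0]
print(len(example['input_ids']))   # 200, the max_length I used
print(len(example['labels']))      # 20, one entry per class
print(example['labels'][:5])       # a list of 0s and 1s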

I then prepared the dataset and the model for the training phase:

from transformers import create_optimizer

batch_size = 16
num_epochs = 3
batches_per_epoch = len(train_dataset) // batch_size
total_train_steps = int(batches_per_epoch * num_epochs)
optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)

tf_train_set = bert.prepare_tf_dataset(
    train_dataset,
    shuffle=True,
    batch_size=batch_size
)
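To see what the tf.data pipeline actually yields per batch, I can print its element spec:

# Shapes and dtypes of the batches produced by prepare_tf_dataset
print(tf_train_set.element_spec)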

bert.compile(optimizer=optimizer)  # no loss passed, so as far as I understand the model uses its internal loss

Up to this point everything runs without problems. But when I call:

history = bert.fit(tf_train_set, epochs=5)

I get a very long error; I think the most important part is:

ValueError: `labels.shape` must equal `logits.shape` except for the last dimension. Received: labels.shape=(320,) and logits.shape=(16, 20)
    
    
    Call arguments received by layer "tf_bert_for_sequence_classification" (type TFBertForSequenceClassification):
      • self={'input_ids': 'tf.Tensor(shape=(16, 200), dtype=int64)', 'attention_mask': 'tf.Tensor(shape=(16, 200), dtype=int64)', 'labels': 'tf.Tensor(shape=(16, 20), dtype=int64)'}
      • input_ids=None
      • attention_mask=None
      • token_type_ids=None
      • position_ids=None
      • head_mask=None
      • inputs_embeds=None
      • output_attentions=None
      • output_hidden_states=None
      • return_dict=None
      • labels=None
      • training=True

The error says my labels have shape (320,) instead of matching the logits’ shape (16, 20), yet the call arguments below the error show that I did pass labels of shape (16, 20). Since 320 = 16 × 20, it looks like my labels are being flattened somewhere inside the loss computation, but I can’t understand why.

Thank you very much to all of you.

For completeness, I also tried compiling with an explicit loss, like this:

model.compile(optimizer=optimizer, loss='categorical_crossentropy')