label2id not working

I recently started building a text classification pipeline using this tutorial:

I converted my own data to a dataset with 1 col called text and 1 col called label.

I then followed the tutorial, but I got this error:


Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (label in this case) have excessive nesting (inputs type list where type int is expected).


I thought the id2label and label2id parameters of the model would take care of this, but they didn't, so I added a line to my batch tokenization function to convert the labels to int.
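
A minimal sketch of that kind of conversion, assuming the label column holds strings. The label names and the `encode_labels` helper are hypothetical and only for illustration, and the tokenizer call is left out:

```python
# Hypothetical label names; replace with the classes in your own data.
label2id = {"negative": 0, "positive": 1}
id2label = {idx: name for name, idx in label2id.items()}


def encode_labels(batch):
    """Convert a batch of string labels to integer ids.

    `batch` is a dict-like mapping of column name -> list of values,
    the shape a batched `Dataset.map` call passes to its function.
    """
    batch["label"] = [label2id[name] for name in batch["label"]]
    return batch


# Small stand-in batch so the sketch runs without any dataset library.
example = {
    "text": ["great movie", "terrible plot"],
    "label": ["positive", "negative"],
}
encoded = encode_labels(example)
print(encoded["label"])
```

In a real pipeline the same function body would sit next to the tokenizer call inside the function passed to `map(..., batched=True)`.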

Now my tokenized dataset has 1 col called text, 1 col called label, 1 col called input_ids, 1 col called attention_masks and 1 more column.

My question is: which columns does the Trainer use to train and validate the text classification pipeline? Should I remove all the other cols from my tokenized dataset?