Remove columns before training

cravephi · January 4, 2023, 7:36am

Hi,

I see in many examples that unnecessary columns are removed from the dataset before training.
Is this mandatory? I assume Trainer only uses the columns produced by the tokenizer anyway (together with the “labels” column).
Is it to save some I/O? I assume Transformers does not load unused columns, but then I am not sure why the examples remove them.
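To make the question concrete, here is a minimal sketch of the behaviour I am assuming. As far as I understand, when `remove_unused_columns` is left at its default of `True` in `TrainingArguments`, Trainer compares the dataset's column names against the argument names of the model's `forward` method (plus the label names) and drops the rest. The code below only imitates that idea with the standard library; `forward`, `columns_to_keep`, and the column names are hypothetical stand-ins, not the library's actual implementation.

```python
import inspect


def forward(input_ids=None, attention_mask=None, labels=None):
    """Hypothetical stand-in for a model's forward method."""
    return None


def columns_to_keep(forward_fn, dataset_columns, label_names=("label", "label_ids")):
    """Keep only columns whose names match a forward() argument or a label name.

    Illustrative only: this mirrors the idea behind remove_unused_columns,
    not the exact code in Trainer.
    """
    accepted = set(inspect.signature(forward_fn).parameters)
    accepted.update(label_names)
    # Preserve the original column order.
    return [col for col in dataset_columns if col in accepted]


# Hypothetical columns: raw fields plus the tokenizer's outputs.
dataset_columns = ["text", "idx", "input_ids", "attention_mask", "labels"]

kept = columns_to_keep(forward, dataset_columns)
ignored = [col for col in dataset_columns if col not in kept]

print("kept:", kept)
print("ignored:", ignored)
```

If that assumption holds, removing columns by hand (for example with `dataset.remove_columns(...)` or the `remove_columns` argument of `dataset.map(...)`) would matter mostly when Trainer is not used, for instance in a custom training loop where a data collator cannot turn raw string columns into tensors.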

Sorry, this is very much a beginner question.
