I am using load_dataset to load my data, which is a set of files stored in a directory. I load it with:
dataset = load_dataset(path='/Users/petar/Documents/bert/data', split='train')
Only the “train” split is available, and all of my data ends up there. I would also like an eval or test split so that I can evaluate during training and use “per_device_eval_batch_size” in TrainingArguments. How can I create that split and specify its size?