I am trying the map() function with the following code:
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset('csv', data_files={'train': 'train.csv', 'validation': 'dev.csv', 'test': 'test.csv'}, column_names=['sentence1', 'sentence2', 'label'])
tokenizer = AutoTokenizer.from_pretrained('roberta-large')

def tokenize_function(samples):
    # Tokenize sentence pairs; truncation takes the boolean True, not the string 'True'
    return tokenizer(samples['sentence1'], samples['sentence2'], padding=True, truncation=True, max_length=256)

tokenized_datasets = dataset.map(tokenize_function, batched=True)
The program seems to hang at this point.
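For context, here is a quick sanity check I can run to narrow this down (a minimal sketch; the slice size of 8 is arbitrary, and it assumes the CSV splits actually loaded). It calls the tokenizer directly on a small slice of the train split, outside of map(); if this returns immediately, the hang is in map() or in loading the dataset rather than in the tokenization itself:

# Slicing a Dataset returns a plain dict of column name -> list of values
small_batch = dataset['train'][:8]
encodings = tokenizer(
    small_batch['sentence1'],
    small_batch['sentence2'],
    padding=True,
    truncation=True,
    max_length=256,
)
# Each field (input_ids, attention_mask, ...) should have 8 entries
print({key: len(values) for key, values in encodings.items()})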