In Python, map works as follows:
map(func, arg_1, arg_2)
In datasets.map, we are required to pass in a callable that expects objects of the form dataset[idx]. This means that things like the tokenizer have to be defined and accessible within the scope of that function, along with any other parameters we want to pass. Can we pass arguments as in a normal function call, as shown above? I’m asking because I have two preprocess_func functions, one for the train split and one for the validation split, and I have to write nearly the same function twice, which looks repetitive (there are only slight changes between the two).
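
To make the question concrete, here is a minimal sketch of what I am hoping for: a single preprocessing function whose extra arguments are bound with functools.partial, once per split. The names toy_tokenizer, max_length and is_train are hypothetical and only for illustration, and the built-in map over a list of dicts stands in for the real dataset so that the snippet runs on its own.

```python
from functools import partial


def toy_tokenizer(text, max_length):
    # Hypothetical stand-in for a real tokenizer: split on whitespace and truncate.
    return text.split()[:max_length]


def preprocess_func(example, tokenizer, max_length, is_train):
    # Shared logic for both splits; only the is_train branch differs.
    features = {"tokens": tokenizer(example["text"], max_length)}
    if is_train:
        # Assumption for illustration: only the train split keeps its label.
        features["label"] = example["label"]
    return features


# Bind the extra arguments once per split instead of writing two functions.
preprocess_train = partial(
    preprocess_func, tokenizer=toy_tokenizer, max_length=3, is_train=True
)
preprocess_valid = partial(
    preprocess_func, tokenizer=toy_tokenizer, max_length=3, is_train=False
)

# A plain list of dicts stands in for dataset[idx]-style examples.
examples = [{"text": "a b c d", "label": 1}, {"text": "e f", "label": 0}]
train_features = list(map(preprocess_train, examples))
valid_features = list(map(preprocess_valid, examples))
```

If I understand correctly, the same partial objects could be passed as the callable, as in dataset.map(preprocess_train). Dataset.map also seems to take an fn_kwargs dictionary of keyword arguments that it forwards to the function, which might make the partial unnecessary.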