Dataset map method - how to pass argument to the function

BramVanroy · March 31, 2022, 11:51am

Is there any downside to using either option? If I remember correctly (?), lambdas are not picklable. So my assumption would be that if you do something like

new_dataset = my_dataset.map(lambda batch: my_processing_func(batch, model, tokenizer), batched=True)

it won’t be cached. Is that correct?
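
To make the question concrete, here is a minimal sketch that uses only the standard library. It compares a lambda with `functools.partial` as two ways to bind extra arguments, and checks each against plain `pickle`. The function body, the `suffix` argument, and the `is_picklable` helper are hypothetical stand-ins, not anything from the actual setup (which passes `model` and `tokenizer`).

```python
import pickle
from functools import partial


def my_processing_func(batch, suffix):
    """Hypothetical stand-in: append `suffix` to every string in the batch."""
    return {"text": [t + suffix for t in batch["text"]]}


def is_picklable(obj):
    """Return True if the standard-library pickle can serialize `obj`."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


batch = {"text": ["a", "b"]}

# Two ways to bind the extra argument before handing the function to map().
with_lambda = lambda b: my_processing_func(b, "!")
with_partial = partial(my_processing_func, suffix="!")

# Both callables produce the same output for the same batch.
same_output = with_lambda(batch) == with_partial(batch)

# Standard pickle stores functions by name, so an anonymous lambda fails,
# while a partial of a module-level function normally succeeds.
lambda_picklable = is_picklable(with_lambda)
partial_picklable = is_picklable(with_partial)

print(same_output, lambda_picklable, partial_picklable)
```

This only shows the behaviour of plain `pickle`. As far as I know, 🤗 Datasets computes its cache fingerprint with dill rather than plain `pickle`, so a lambda on its own may not prevent caching; treat that as an assumption to verify. What probably matters more is whether the objects the function captures (here `model` and `tokenizer`) hash the same way from run to run. `Dataset.map` also accepts a `fn_kwargs` dictionary, so the extra arguments can be passed without any wrapper, e.g. `my_dataset.map(my_processing_func, batched=True, fn_kwargs={"suffix": "!"})`.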
