Hugging Face Forums
Multiprocessing and sharding when creating dataset from scratch using loading script
🤗Datasets
parano
November 4, 2022, 3:17pm
Nice …
Let me know if I can contribute to any of the stuff …