Multiprocessing and sharding when creating a dataset from scratch using a loading script

Nice …

Let me know if I can contribute to any of the stuff …