Hi,
I am looking for a way to download a large dataset, transform it and then upload it to another location. Note that the transformations for each instance is independent of others.
I can load the dataset in streaming mode and start the transformation but cannot find a way to write to huggingface hub (in batches) in as download and transformation are ongoing. Wondering if such pattern exists