How to prepare and upload a large CSV dataset using Git LFS or push_to_hub

I’m planning to upload around 50GB of CSV files to my Hugging Face dataset, and I wonder what’s the proper way to push them?
Should we use push_to_hub or Git LFS? And what’s the proper way to process the CSV files before uploading?

Hi! You’ll probably get better performance (faster uploads) by using Git LFS. push_to_hub stores the data in the compressed Parquet format, which can save a lot of bandwidth, but it doesn’t (currently) use a git-based workflow, which results in slower upload speeds in most situations.
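
To make the push_to_hub option concrete, here is a minimal sketch; the data_files glob, the repo id, and the max_shard_size value are placeholders you would adapt to your own setup:

```python
from datasets import load_dataset

# Read the local CSV files into a single Dataset.
# "data/*.csv" and "username/my-csv-dataset" are placeholders.
dataset = load_dataset("csv", data_files="data/*.csv", split="train")

# push_to_hub converts the data to compressed Parquet shards and uploads them;
# max_shard_size controls how large each uploaded shard is.
dataset.push_to_hub("username/my-csv-dataset", max_shard_size="500MB")
```

Note that load_dataset first converts the CSVs into an Arrow cache on disk, so you'll want roughly the dataset's size in free space before pushing. The Git LFS route instead uses the standard git CLI against the dataset repo (git lfs install, then add, commit, and push the CSV files, making sure they are tracked by LFS), which keeps the files on the Hub exactly as they are locally.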