How to upload big JSONL files efficiently?

Hi there,

I am uploading .jsonl files which are about 70 GB, following the Git LFS guide at https://huggingface.co/docs/huggingface_hub/v0.14.1/guides/upload#push-files-with-git-lfs

But there is a small problem for me: once I add the files using git add ., the same amount of data is written to the disk again. As far as I know, this is related to Git LFS and is expected behavior.

Is there any way to mitigate the disk usage? Should I upload the files one by one, or are there any other tricks?

You can use the huggingface_hub Python library (which also has a CLI) to upload the files over HTTP, so you don't need a local Git LFS clone.

See the docs here: Upload files to the Hub
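To make the "one by one" idea concrete, here is a minimal sketch using `HfApi.upload_file` from huggingface_hub. The helper names (`find_jsonl_files`, `upload_one_by_one`), the optional `api` argument, and the default `repo_type="dataset"` are my own assumptions for illustration, not something from the docs. It also assumes you are already logged in (for example via `huggingface-cli login`).

```python
from pathlib import Path


def find_jsonl_files(folder):
    """Return all .jsonl files under `folder`, sorted for a stable upload order."""
    return sorted(Path(folder).rglob("*.jsonl"))


def upload_one_by_one(folder, repo_id, repo_type="dataset", api=None):
    """Upload each .jsonl file in its own call (hypothetical helper).

    `repo_type="dataset"` is an assumption; use "model" or "space" if that
    matches your repository.
    """
    if api is None:
        # Imported lazily so the file-listing helper works without the library.
        from huggingface_hub import HfApi

        api = HfApi()

    uploaded = []
    for path in find_jsonl_files(folder):
        # Keep the same relative layout inside the repo as on disk.
        path_in_repo = path.relative_to(folder).as_posix()
        api.upload_file(
            path_or_fileobj=str(path),
            path_in_repo=path_in_repo,
            repo_id=repo_id,
            repo_type=repo_type,
        )
        uploaded.append(path_in_repo)
    return uploaded
```

You would call it with something like `upload_one_by_one("data", "your-username/your-dataset")`, where both the folder and the repo id are placeholders. Since nothing is staged with git add, this should avoid the extra copy that Git LFS keeps on disk.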


Thank you, I have used https://huggingface.co/docs/huggingface_hub/guides/upload#upload-a-folder and it worked perfectly.
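For anyone landing here later, a call along these lines is roughly what the upload-a-folder guide describes. This is only a sketch, not necessarily the exact code I ran: the wrapper name `upload_jsonl_folder`, the optional `api` argument, the `repo_type="dataset"` choice, and the `*.jsonl` filter are assumptions for illustration.

```python
def upload_jsonl_folder(folder, repo_id, repo_type="dataset", api=None):
    """Upload the .jsonl files of a local folder in one call (hypothetical wrapper).

    `repo_type="dataset"` is an assumption; adjust it to your repository type.
    """
    if api is None:
        # Imported lazily; requires `pip install huggingface_hub`.
        from huggingface_hub import HfApi

        api = HfApi()

    return api.upload_folder(
        folder_path=str(folder),
        repo_id=repo_id,
        repo_type=repo_type,
        # Only send the .jsonl files; drop this argument to upload everything.
        allow_patterns="*.jsonl",
    )
```

Usage would be something like `upload_jsonl_folder("data", "your-username/your-dataset")`, with both values being placeholders.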