Duplicated cache- arrow files when uploading large folder?

Hello,

I have a folder with .arrow files that I previously created using save_to_disk. When I now use upload_large_folder, I see some cache- files being pushed to the Hub that I don't see in the local directory I'm pushing from. Is this normal? Are they duplicates, or is HF splitting the files in two?

It seems that all the files from my local folder were uploaded and that these are additional.

This is an example of the "extra" files:

['uniref50_202401/arrow/train/cache-5438b1d15cbf9f5a_00004_of_00024.arrow', 'uniref50_202401/arrow/train/cache-77dd2d54eba47e69_00004_of_00024.arrow']

Can I just delete those in the repo?


It seems like it's okay to delete them, but if you're worried, ask lhonestq.
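If it helps, here is a rough sketch for checking which cache- files exist only in the repo before deleting anything. The helper names (`local_relative_paths`, `extra_cache_files`, `find_extra_on_hub`) are made up for illustration, and it assumes the repo is a dataset repo and that the extra files follow the `cache-*.arrow` naming shown in the question. `list_repo_files` is a method of `HfApi` in huggingface_hub.

```python
from pathlib import Path, PurePosixPath


def local_relative_paths(folder):
    """Return every file under `folder` as a POSIX-style path relative to it."""
    root = Path(folder)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def extra_cache_files(repo_files, local_files):
    """Return repo files named cache-*.arrow that have no local counterpart."""
    local = set(local_files)
    return sorted(
        f
        for f in repo_files
        if f not in local
        and PurePosixPath(f).name.startswith("cache-")
        and f.endswith(".arrow")
    )


def find_extra_on_hub(repo_id, folder, repo_type="dataset"):
    """Compare the files in a Hub repo against a local folder (needs network access)."""
    # Imported lazily so the two helpers above work without huggingface_hub installed.
    from huggingface_hub import HfApi

    repo_files = HfApi().list_repo_files(repo_id, repo_type=repo_type)
    return extra_cache_files(repo_files, local_relative_paths(folder))
```

You would call it as `find_extra_on_hub("your-username/your-dataset", "path/to/local/folder")`, where both arguments are placeholders. The comparison only works if the local folder is the same root you passed to upload_large_folder, so that the relative paths line up with the paths in the repo. Once you have reviewed the list, each file can be removed with `HfApi().delete_file(path_in_repo=..., repo_id=..., repo_type="dataset")`.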