Currently, I can use `dataset.save_to_disk("s3://…")` to save directly to an S3 bucket as Arrow files. But how can I save the dataset as a Parquet file?
The `to_parquet` method fails when saving directly to the S3 bucket.
Currently, the only option is to save them locally and then upload them to an S3 bucket.
I opened an issue as this would be useful to support: Support `fsspec` in `Dataset.to_<format>` methods · Issue #6086 · huggingface/datasets · GitHub.
Yeah, what I do is use `to_parquet` to write the file locally and then upload it with boto3.
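For reference, here is a minimal sketch of that workaround. It assumes AWS credentials are already configured for boto3; the helper name `upload_dataset_as_parquet`, the bucket name, and the object key are made up for illustration and are not part of `datasets` or boto3.

```python
import os
import tempfile


def upload_dataset_as_parquet(dataset, bucket, key, s3_client=None):
    """Write `dataset` to a temporary local Parquet file, then upload it to S3.

    Hypothetical helper: `dataset` is expected to be a `datasets.Dataset`
    (anything with a `to_parquet(path)` method works).
    """
    if s3_client is None:
        # Imported lazily so the helper can also be called with a stub client.
        import boto3

        s3_client = boto3.client("s3")

    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, "data.parquet")
        # Step 1: save locally as Parquet.
        dataset.to_parquet(local_path)
        # Step 2: upload the local file to the bucket.
        s3_client.upload_file(local_path, bucket, key)


# Example usage (bucket and key are placeholders):
#   from datasets import load_dataset
#   ds = load_dataset("imdb", split="train")
#   upload_dataset_as_parquet(ds, "my-bucket", "imdb/train.parquet")
```

The temporary directory is deleted once the upload returns, so nothing is left on local disk afterwards.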