Does Hugging Face support loading a raw text dataset from HDFS?

I noticed that both load_from_disk and save_to_disk support the `fs` parameter, but load_dataset does not.

I want to know if it is possible to load text data from HDFS via load_dataset.

Any suggestions would be appreciated.

Hi! Currently load_dataset only supports reading files from local paths or over HTTP. We may add support for other filesystems (HDFS, cloud storage…) later :)
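Since load_dataset reads local paths, one possible workaround is to copy the file from HDFS to local disk first and load it from there. The following is only a rough sketch, not an official recipe: the helper names `stage_locally` and `load_text_from_hdfs` are made up for illustration, and it assumes that fsspec's `hdfs` protocol works on your machine (it needs pyarrow and a configured Hadoop client).

```python
import os
import tempfile


def stage_locally(fs, remote_path, local_dir=None):
    """Copy one remote file to local disk and return the local path.

    `fs` is any fsspec-style filesystem object exposing `get(rpath, lpath)`.
    """
    local_dir = local_dir or tempfile.mkdtemp(prefix="hdfs_stage_")
    local_path = os.path.join(local_dir, os.path.basename(remote_path))
    fs.get(remote_path, local_path)
    return local_path


def load_text_from_hdfs(remote_path):
    # Imported lazily so stage_locally can be reused without these packages.
    import fsspec
    from datasets import load_dataset

    # Assumption: the "hdfs" protocol is usable here (pyarrow plus a
    # configured Hadoop client); connection details come from the environment.
    fs = fsspec.filesystem("hdfs")
    local_path = stage_locally(fs, remote_path)
    # The "text" builder yields one example per line in a "text" column.
    return load_dataset("text", data_files=local_path, split="train")
```

With a placeholder path, this would be called as `ds = load_text_from_hdfs("/user/me/corpus.txt")`. The obvious cost is that the whole file has to fit on local disk before loading.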

Thanks for your reply. Another question: does load_from_disk support streaming mode?

No, it doesn't. Only load_dataset supports streaming mode.