Shouldn’t this work?
dataset = load_dataset('json', data_files='path/to/file')
dataset.train_test_split(test_size=0.15)
I’m getting the following error:
Using custom data configuration default
Downloading and preparing dataset json/default-cf892ee5bc3fc36a (download: Unknown size, generated: Unknown size, post-processed: Unknown size, total: Unknown size) to /root/.cache/huggingface/datasets/json/default-cf892ee5bc3fc36a/0.0.0/70d89ed4db1394f028c651589fcab6d6b28dddcabbe39d3b21b4d41f9a708514...
Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/default-cf892ee5bc3fc36a/0.0.0/70d89ed4db1394f028c651589fcab6d6b28dddcabbe39d3b21b4d41f9a708514. Subsequent calls will reuse this data.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-3-59d55201b8c3> in <module>()
1 dataset = load_dataset('json', data_files='/path/to/file')
----> 2 dataset.train_test_split(test_size=0.15)
3 dataset.shard(10)
4 dataset
AttributeError: 'DatasetDict' object has no attribute 'train_test_split'