I'm trying to load a .hf dataset from the Hugging Face Hub in streaming mode.
The dataset is tuan124816/newcs2_data:
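from datasets import load_dataset

# Stream the repo instead of downloading it in full
dataset = load_dataset("tuan124816/newcs2_data", streaming=True)
hf_dataset = dataset['test']
print(hf_dataset)
Output: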
IterableDataset({
    features: Unknown,
    n_shards: 600
})
When I print the first element:
print(next(iter(hf_dataset)))
Output:
{'_data_files': [{'filename': 'data-00000-of-00001.arrow'}], '_fingerprint': '905978a8bab44335', '_format_columns': ['observations', 'actions', 'rewards'], '_format_kwargs': {}, '_format_type': None, '_output_all_columns': False, '_split': None}
Is this the right way to load this kind of dataset?
How can I read the actual data and see what is inside ['observations', 'actions', 'rewards']?
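
One thing that may be worth checking (a rough sketch, not a confirmed fix): the keys in the record above (_data_files, _fingerprint, _format_columns, ...) are the ones that Dataset.save_to_disk writes to its state.json file, so the stream may be returning that metadata file rather than rows of data. The helper names below (looks_like_saved_state, find_saved_dataset_dirs, load_saved_datasets) are hypothetical, and the code assumes the repo has already been copied to a local folder whose layout I have not verified.

```python
import json
from pathlib import Path

# Keys that Dataset.save_to_disk writes into state.json.
STATE_KEYS = {"_data_files", "_fingerprint", "_format_columns"}


def looks_like_saved_state(record):
    """Return True if a record looks like state.json metadata, not a data row."""
    return isinstance(record, dict) and STATE_KEYS.issubset(record)


def find_saved_dataset_dirs(root):
    """List directories under `root` that hold a state.json with those keys."""
    found = []
    for state_file in sorted(Path(root).rglob("state.json")):
        try:
            state = json.loads(state_file.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or not valid JSON: skip it
        if looks_like_saved_state(state):
            found.append(state_file.parent)
    return found


def load_saved_datasets(root):
    """Load every saved dataset found under `root` (needs the `datasets` package)."""
    from datasets import load_from_disk  # imported lazily; only needed here

    return {str(d): load_from_disk(str(d)) for d in find_saved_dataset_dirs(root)}
```

If that assumption holds, a local copy could be fetched with huggingface_hub.snapshot_download(repo_id="tuan124816/newcs2_data", repo_type="dataset"), and the returned path passed to load_saved_datasets. For each loaded dataset ds, ds.features and ds[0] should then show what the observations, actions and rewards columns contain.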