I am following this documentation here. Running datasets-cli test . --save_infos --all_configs
fails with the following message:
0 tables [00:00, ? tables/s]Failed to read file '/raid/home/xyz/hfspace/codequeries/ideal_test.json' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Column() changed from object to number in row 0
Traceback (most recent call last):
  File "/raid/home/xyz/hfspace/bin/datasets-cli", line 8, in <module>
    sys.exit(main())
  File "/raid/home/xyz/hfspace/lib/python3.8/site-packages/datasets/commands/datasets_cli.py", line 39, in main
    service.run()
  File "/raid/home/xyz/hfspace/lib/python3.8/site-packages/datasets/commands/test.py", line 135, in run
    builder.download_and_prepare(
  File "/raid/home/xyz/hfspace/lib/python3.8/site-packages/datasets/builder.py", line 704, in download_and_prepare
    self._download_and_prepare(
  File "/raid/home/xyz/hfspace/lib/python3.8/site-packages/datasets/builder.py", line 793, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "/raid/home/xyz/hfspace/lib/python3.8/site-packages/datasets/builder.py", line 1268, in _prepare_split
    for key, table in logging.tqdm(
  File "/raid/home/xyz/hfspace/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/raid/home/xyz/hfspace/lib/python3.8/site-packages/datasets/packaged_modules/json/json.py", line 133, in _generate_tables
    raise ValueError(
ValueError: Not able to read records in the JSON file at /raid/home/xyz/hfspace/codequeries/ideal_test.json. You should probably indicate the field of the JSON file containing your records. This JSON file contain the following fields: ['examples']. Select the correct one and provide it as `field='XXX'` to the dataset loading method.
I am trying to read a JSON file that has the following structure:
- examples
  - question
  - context
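To make the layout concrete, here is a small Python sketch that builds a document with the same shape. The sample question and context strings are made up for illustration; only the key names come from my file.

```python
import json

# Hypothetical sample record; only the key names match the real file.
sample = {
    "examples": [
        {"question": "Is this import unused?", "context": "import os\nprint('hi')"},
    ]
}

# Round-trip through JSON text to mimic reading the file from disk.
text = json.dumps(sample, indent=2)
parsed = json.loads(text)

# The top level is a single object whose only field is "examples";
# the records themselves live in the list stored under that field.
top_level_fields = list(parsed.keys())
record_fields = sorted(parsed["examples"][0].keys())
print(top_level_fields)
print(record_fields)
```

So the records are not at the top level of the file, which matches the list of fields reported in the error message.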
My _generate_examples is the following:
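def _generate_examples(self, filepath, split):
    # Only the test split is expected here.
    assert split == datasets.Split.TEST
    with open(filepath, "rb") as f:
        cq_data = json.load(f)
    # Records are stored in a list under the top-level "examples" field.
    for key, row in enumerate(cq_data["examples"]):
        # Build a string key; adding an int to "_" would raise a TypeError.
        instance_key = f"{key}_{row['question']}"
        yield instance_key, {
            "question": row["question"],
            "context": row["context"],
        }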
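As a sanity check outside of datasets, the same parsing logic can be run as a plain function on a small file. This is only a sketch: iter_examples is a hypothetical standalone stand-in for the method above, and the sample data is invented.

```python
import json
import os
import tempfile


def iter_examples(filepath):
    """Standalone stand-in for _generate_examples (hypothetical helper)."""
    with open(filepath, "rb") as f:
        cq_data = json.load(f)
    for key, row in enumerate(cq_data["examples"]):
        # Format the integer index into the key; int + str would raise TypeError.
        instance_key = f"{key}_{row['question']}"
        yield instance_key, {
            "question": row["question"],
            "context": row["context"],
        }


# Invented sample data with the same shape as the real file.
sample = {
    "examples": [
        {"question": "q1", "context": "c1"},
        {"question": "q2", "context": "c2"},
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ideal_test.json")
    with open(path, "w") as f:
        json.dump(sample, f)
    results = list(iter_examples(path))

print(results[0])
```

If this runs cleanly on the real file, the parsing logic itself is probably not what raises the error shown above.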
Could you please help me understand where the error is coming from?