Hi HF community! I have a .jsonl dataset in which every line looks like this:
{'gt_parse': {
    'company': 'KUAFOR',
    'vergiDairesiNO': '32969258048',
    'tarih': '24-07-2023',
    'fisNO': 'FİS NO:4',
    'toplamKDV': 'KDV *0,00',
    'toplamTutar': 'TOP *550,00',
    'items': [{'kdvUrunu': 'ROFLE', 'kdvYuzdesi': '% ', 'kdvTutar': '*550,00'},
              {'kdvUrunu': 'OJE', 'kdvYuzdesi': '%1', 'kdvTutar': '*120,00'}]},
 'meta': {'image_id': 181, 'image_size': {'width': 430, 'height': 1591}, 'split': 'train'}}
I am developing a loading script, but I get an error while loading the dataset. I have defined the _info method of my loading class as in the script below. What is wrong with it? I am especially suspicious of the 'items' key. How can I define this key correctly? I would appreciate any help.
def _info(self):
    return datasets.DatasetInfo(
        description=_DESCRIPTION,
        features=datasets.Features(
            {
                "gt_parse": datasets.features.Sequence(
                    {
                        'company': datasets.Value("string"),
                        'vergiDairesiNO': datasets.Value("string"),
                        'tarih': datasets.Value("string"),
                        'fisNO': datasets.Value("string"),
                        'toplamKDV': datasets.Value("string"),
                        'toplamTutar': datasets.Value("string"),
                        'items': datasets.features.Sequence(
                            {
                                "kdvUrunu": datasets.Value("string"),
                                "kdvYuzdesi": datasets.Value("string"),
                                "kdvTutar": datasets.Value("string")
                            })
                    }),
                "meta": {
                    'image_id': datasets.Value("int64"),
                    'image_size': datasets.features.Sequence(
                        {
                            'width': datasets.Value("int64"),
                            'height': datasets.Value("int64")
                        }),
                    'split': datasets.Value("string")
                }
            }),
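
A possible cause, offered as a guess rather than a confirmed diagnosis: in the record above, gt_parse and image_size are single nested objects, and only 'items' is a list. As far as I understand the datasets documentation, a plain Python dict inside Features describes a nested object, while Sequence describes a list (and a Sequence over a dict is stored as a dict of lists). If that is right, wrapping gt_parse and image_size in Sequence would not match the data, while Sequence on 'items' would. The sketch below checks this reading with the standard library only. SHAPE, ITEM and matches() are hypothetical helpers made up for illustration, not part of datasets, and the sample line is the record from this post rewritten as strict JSON.

```python
import json

# Plain-Python mirror of the record layout: a dict is a nested object, a
# one-element list means "list of", and a type is a leaf value.
# ITEM, SHAPE and matches() are hypothetical helpers, not datasets APIs.
ITEM = {"kdvUrunu": str, "kdvYuzdesi": str, "kdvTutar": str}
SHAPE = {
    "gt_parse": {
        "company": str,
        "vergiDairesiNO": str,
        "tarih": str,
        "fisNO": str,
        "toplamKDV": str,
        "toplamTutar": str,
        "items": [ITEM],  # the only list in the record
    },
    "meta": {
        "image_id": int,
        "image_size": {"width": int, "height": int},  # a single object
        "split": str,
    },
}


def matches(value, shape):
    """Return True if value has the nested layout described by shape."""
    if isinstance(shape, dict):
        return (
            isinstance(value, dict)
            and value.keys() == shape.keys()
            and all(matches(value[k], shape[k]) for k in shape)
        )
    if isinstance(shape, list):
        return isinstance(value, list) and all(matches(v, shape[0]) for v in value)
    return isinstance(value, shape)


# One line of the file, written as strict JSON (double-quoted keys and strings).
line = json.dumps({
    "gt_parse": {
        "company": "KUAFOR",
        "vergiDairesiNO": "32969258048",
        "tarih": "24-07-2023",
        "fisNO": "FİS NO:4",
        "toplamKDV": "KDV *0,00",
        "toplamTutar": "TOP *550,00",
        "items": [
            {"kdvUrunu": "ROFLE", "kdvYuzdesi": "% ", "kdvTutar": "*550,00"},
            {"kdvUrunu": "OJE", "kdvYuzdesi": "%1", "kdvTutar": "*120,00"},
        ],
    },
    "meta": {
        "image_id": 181,
        "image_size": {"width": 430, "height": 1591},
        "split": "train",
    },
})
record = json.loads(line)

# Layout with gt_parse declared as a list of objects, which is roughly what
# wrapping it in Sequence implies.
wrong = dict(SHAPE, gt_parse=[SHAPE["gt_parse"]])

print(matches(record, SHAPE))  # gt_parse as a plain nested object
print(matches(record, wrong))  # gt_parse as a list of objects
```

If the first check passes and the second fails, the equivalent change in the loading script would be to use plain dicts for gt_parse and image_size and keep Sequence only for 'items'. I have not verified this against a specific datasets version.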