The answers column is a string, not a dictionary, when loading a CSV with load_dataset

I have two CSV files, one for training and one for validation. When I load them with the following code, the answers column comes back as a string instead of a dictionary.

from datasets import load_dataset

datafiles = {"train": "/content/tain.csv", "validation": "/content/validation.csv"}

raw_datasets = load_dataset("csv", data_files=datafiles)

Output:

DatasetDict({
    train: Dataset({
        features: ['id', 'context', 'question', 'answers'],
        num_rows: 5834
    })
    validation: Dataset({
        features: ['id', 'context', 'question', 'answers'],
        num_rows: 1459
    })
})

When I access answers, it is a string, not a dictionary.

raw_datasets["train"][0]['answers']

"{'text': ['Vimy Ridge'], 'answer_start': [51]}"

I cannot access text: raw_datasets["train"][0]['answers']['text'] raises an error.

Hi! CSV data is flat: each field can be a string or a number, but not a nested structure.

You can load JSON files instead, or decode the nested data with json.loads inside map:

import json
datasets = raw_datasets.map(lambda x: {"answers": json.loads(x["answers"])})
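
One caveat: json.loads only accepts valid JSON, which requires double-quoted keys and strings. The value shown in the question uses single quotes, which is Python literal syntax, so json.loads would likely raise a JSONDecodeError on it. A minimal sketch of a more tolerant parser, assuming the cells were written with Python's str() on a dict (parse_answers is a hypothetical helper name, not part of the datasets library):

```python
import ast
import json


def parse_answers(raw):
    """Turn a stringified 'answers' cell back into a dict (hypothetical helper)."""
    try:
        # Works when the cell holds valid JSON (double-quoted keys and strings).
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fall back to Python literal syntax (single quotes), as in the question.
        # ast.literal_eval only evaluates literals, so it does not run arbitrary code.
        return ast.literal_eval(raw)


sample = "{'text': ['Vimy Ridge'], 'answer_start': [51]}"
parsed = parse_answers(sample)
print(parsed["text"])
```

It can then be plugged into the same map call, for example raw_datasets.map(lambda x: {"answers": parse_answers(x["answers"])}), after which raw_datasets["train"][0]["answers"]["text"] should be accessible.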