Shape of SQuAD data for question answering

Before this looks too naive even for a beginner, I should mention that I read all the questions I could find about the “squad” dataset that is loaded for the Question Answering example, so if I missed something, my apologies.
The question is simple - the current shape of the “squad” dataset looks like this -
DatasetDict({
    train: Dataset({
        features: ['version', 'data'],
        num_rows: 1
    })
    validation: Dataset({
        features: ['version', 'data'],
        num_rows: 1
    })
})
Whereas your old Colab notebooks show it in a different structure, such as this -

DatasetDict({
    train: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 87599
    })
    validation: Dataset({
        features: ['id', 'title', 'context', 'question', 'answers'],
        num_rows: 10570
    })
})
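To make the difference between the two printouts concrete: the first one (a single row with 'version' and 'data') looks like the raw nested SQuAD v1.1 JSON loaded as-is, while the second has one row per question. Below is a minimal sketch of flattening the nested form into the flat columns, assuming the standard v1.1 layout (data -> paragraphs -> qas -> answers). The function name flatten_squad and the tiny sample are made up for illustration.

```python
def flatten_squad(raw):
    """Turn nested SQuAD v1.1 JSON into flat, column-oriented lists.

    Assumes the usual v1.1 nesting: data -> paragraphs -> qas -> answers.
    """
    columns = {"id": [], "title": [], "context": [], "question": [], "answers": []}
    for article in raw["data"]:
        title = article.get("title", "")
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                columns["id"].append(qa["id"])
                columns["title"].append(title)
                columns["context"].append(context)
                columns["question"].append(qa["question"])
                # One dict of parallel lists per question, as in the flat layout.
                columns["answers"].append({
                    "text": [a["text"] for a in qa["answers"]],
                    "answer_start": [a["answer_start"] for a in qa["answers"]],
                })
    return columns


# Hypothetical one-question sample in the nested v1.1 shape.
sample = {
    "version": "1.1",
    "data": [{
        "title": "Example_Article",
        "paragraphs": [{
            "context": "Paris is the capital of France.",
            "qas": [{
                "id": "q1",
                "question": "What is the capital of France?",
                "answers": [{"text": "Paris", "answer_start": 0}],
            }],
        }],
    }],
}

flat = flatten_squad(sample)
print(sorted(flat.keys()))
```

With the real files, raw would come from json.load on train.json or dev.json instead of the sample above.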

So I tried to play it safe by downloading the v1.1 dataset, and I got as far as the preprocessing function by running this on my data -
train_contexts, train_questions, train_answers = read_squad('squad/train.json')
val_contexts, val_questions, val_answers = read_squad('squad/dev.json')

Now my skills run out when we get to the preprocess_training_examples function. I am unable to pass the right object when we need to call the .map function to iterate through my dataset. Any help would be amazing, either redirecting me to “forcing” the v1.1 data through this process in some way, or showing how I could roll my current data structure into a form that can call this .map function. Kindly advise.
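For reference, here is a rough sketch of the second option (rolling the three lists into something that has .map). It assumes read_squad returns three parallel lists and that each answer is a dict with 'text' and 'answer_start' keys, which may not match every version of that helper. The names to_columns and build_dataset are hypothetical, and the ids are synthetic because the lists no longer carry the original question ids.

```python
def to_columns(contexts, questions, answers):
    """Pack three parallel lists into the flat column layout.

    Assumes each answer is a dict with 'text' and 'answer_start' keys.
    """
    if not (len(contexts) == len(questions) == len(answers)):
        raise ValueError("contexts, questions and answers must have equal length")
    return {
        # Synthetic ids: the original question ids are not in the lists.
        "id": [str(i) for i in range(len(contexts))],
        "context": list(contexts),
        "question": list(questions),
        # Wrap each single answer in lists to mirror the flat 'answers' feature.
        "answers": [
            {"text": [a["text"]], "answer_start": [a["answer_start"]]}
            for a in answers
        ],
    }


def build_dataset(columns):
    """Wrap the columns in a datasets.Dataset so that .map is available."""
    from datasets import Dataset  # imported lazily; needs the `datasets` package
    return Dataset.from_dict(columns)


# Intended usage (not executed here; preprocess_fn stands for the
# preprocessing function from the example):
#   train_dataset = build_dataset(
#       to_columns(train_contexts, train_questions, train_answers))
#   tokenized = train_dataset.map(
#       preprocess_fn, batched=True, remove_columns=train_dataset.column_names)
```

The lazy import keeps to_columns usable and checkable even where the datasets package is not installed.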

Thanks.