I have about a hundred contexts that look like this:
Sir John DOE lives on 55 Main Street NY, born on august 5th 1995 in Buffalo, he is an american citizen.
and these are the questions I want to fine-tune the model on:
Where does John DOE live? (answer: 55 Main Street NY)
Where was John DOE born? (answer: Buffalo)
When was John DOE born? (answer: august 5th 1995)
What is John DOE’s nationality? (answer: american)
Would having the person's name in the training questions hurt the model's accuracy, or is it OK? In production, the context will always have the same shape (name, birth place, birth date, nationality, address), but the name will be different every time. The question will still contain the name, because I plan to ask two questions in sequence:
Question_1: What is the name of the person?
Question_2: "Where does " + question_1["answer"] + " live?"
So should I train the model with the specific name in each question, or replace it with something more generic like "Where does the person live?"
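To make the two options concrete, here is a rough sketch of how the training examples could be built either way. It assumes a SQuAD-style record layout (context, question, answer text plus character offset); `build_examples`, `second_question`, and the field names are made up for illustration and do not come from any library.

```python
# Sketch only: build SQuAD-style training examples for one context, with
# either the real name or a generic placeholder in the question.
# All helper and field names here are illustrative assumptions.

TEMPLATES = {
    "address": "Where does {s} live?",
    "birth_place": "Where was {s} born?",
    "birth_date": "When was {s} born?",
    "nationality": "What is {s}'s nationality?",
}


def build_examples(context, name, answers, use_name=True):
    """Return one extractive-QA example per field for a single context."""
    subject = name if use_name else "the person"
    examples = []
    for field, template in TEMPLATES.items():
        answer = answers[field]
        start = context.find(answer)  # character offset of the answer span
        if start == -1:
            raise ValueError(f"answer {answer!r} not found in context")
        examples.append({
            "context": context,
            "question": template.format(s=subject),
            "answers": {"text": [answer], "answer_start": [start]},
        })
    return examples


def second_question(name_answer):
    """Production-time chaining: plug the answer to question 1 into question 2."""
    return "Where does " + name_answer + " live?"


context = ("Sir John DOE lives on 55 Main Street NY, born on august 5th 1995 "
           "in Buffalo, he is an american citizen.")
answers = {
    "address": "55 Main Street NY",
    "birth_place": "Buffalo",
    "birth_date": "august 5th 1995",
    "nationality": "american",
}

named = build_examples(context, "John DOE", answers, use_name=True)
generic = build_examples(context, "John DOE", answers, use_name=False)
```

With `use_name=True` the training questions have the same form as what `second_question` produces in production; with `use_name=False` they do not, so the production questions would also need to switch to the generic wording.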