How to build an extractive question answering model without knowing the start index of the answer?

Hi there.

I’m looking for a model that can extract information from an unstructured body of text. I came across the extractive QA guide in Hugging Face’s NLP course, but the demo (fine-tuning a BERT model on the SQuAD dataset) requires me to know the start index of each answer. The problem is that, although I know the answer is somewhere in the text, I don’t know where. My dataset is also moderately large, so manually locating the answer in every training text would take a lot of time.

Is there any way to build a model without having to provide the start/end index of the answer?
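For context, here is a rough sketch of the kind of automatic lookup I would like to avoid doing by hand. It assumes each answer string appears verbatim (same casing and whitespace) in its text, which may not hold for my whole dataset. The function names `locate_answer` and `to_squad_example` are made up for illustration, and the output dictionary only mimics the SQuAD-style `answers` field with `text` and `answer_start`.

```python
from typing import Optional, Tuple


def locate_answer(context: str, answer: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of the first exact match, or None."""
    start = context.find(answer)  # str.find returns -1 when there is no match
    if start == -1:
        return None
    return start, start + len(answer)


def to_squad_example(context: str, question: str, answer: str) -> Optional[dict]:
    """Build a SQuAD-style record, or return None if the answer cannot be located.

    Assumption: the first occurrence of the answer is the intended one.
    """
    span = locate_answer(context, answer)
    if span is None:
        return None  # leave these cases for manual review
    start, _ = span
    return {
        "context": context,
        "question": question,
        "answers": {"text": [answer], "answer_start": [start]},
    }


context = "BERT was released by Google in 2018."
record = to_squad_example(context, "Who released BERT?", "Google")
print(record)
```

If the answer occurs more than once in a text, this picks the first occurrence, which may not be the right one, and answers that were paraphrased or reformatted would not be found at all.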

Thank you!