For example, we have some strings that contain person descriptions and an address, and two questions:
- What is the name of the highest-ranking person?
- What city?
The first question is hard and needs a labeled dataset of decent size. The second is almost always answered well out of the box and needs only a few samples for fine-tuning.
So, what is the best way to fine-tune a question-answering model for this case?
Possible ways:
- For each context in the labeled dataset for the first question, additionally label the city and create a SQuAD-like dataset, then train on both questions over the same contexts.
- Fine-tune in series: first on the first-question dataset, then on the few samples for the second question.
Which one is better?
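
To make the first option concrete, here is a minimal sketch of how one context could be turned into two SQuAD-style records, one per question. The helper name `make_squad_records`, the sample context, and the `id` scheme are all made up for illustration; the record layout (`context`, `question`, `answers` with `text` and `answer_start`) follows the usual SQuAD convention, and I am assuming each answer appears verbatim in the context.

```python
from typing import Dict, List


def make_squad_records(context: str, qa_pairs: Dict[str, str], id_prefix: str) -> List[dict]:
    """Build one SQuAD-style record per (question, answer) pair for a single context.

    Hypothetical helper: assumes every answer occurs verbatim in the context
    and uses the first occurrence as the answer span.
    """
    records = []
    for i, (question, answer) in enumerate(qa_pairs.items()):
        start = context.find(answer)
        if start == -1:
            raise ValueError(f"Answer {answer!r} not found in context")
        records.append(
            {
                "id": f"{id_prefix}-{i}",
                "context": context,
                "question": question,
                # SQuAD stores answers as parallel lists of texts and character offsets.
                "answers": {"text": [answer], "answer_start": [start]},
            }
        )
    return records


# Made-up sample context, only to show the shape of the data.
context = "General John Smith and Major Ann Lee, 12 Oak Street, Springfield"
records = make_squad_records(
    context,
    {
        "What is the name of the highest-ranking person?": "John Smith",
        "What city?": "Springfield",
    },
    id_prefix="ex0",
)

for record in records:
    print(record["question"], "->", record["answers"])
```

With this layout both questions share the same context string, so the combined dataset for option 1 is just the concatenation of such records over all labeled contexts. For option 2, the same helper could produce two separate lists, one per question, to train on one after the other.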