I’ve fine-tuned this Question-Answering model to fit a specific business use case we have (identifying the name of a company from a piece of text). When it comes to inference, I’ve found, as @sgugger explains very clearly in this notebook, that the best answer isn’t always the one with the highest start and end logits, since the highest-scoring combination can produce an answer that is too long or too short (just one character).
So when predicting with this model locally, I wrote a return_best_combination function that finds the most practical answer using the list of logit scores, roughly as sketched below.
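For illustration, this is a simplified sketch of that idea (the real version also maps token indices back to character offsets in the original text):

import numpy as np

def return_best_combination(start_logits, end_logits, max_answer_len=30, top_k=1):
    """Pick the highest-scoring valid (start, end) span.

    A span is valid if start <= end and its length is at most max_answer_len,
    which filters out the too-long / one-character answers mentioned above.
    """
    start_logits = np.asarray(start_logits)
    end_logits = np.asarray(end_logits)

    # Score every candidate span as start_logit + end_logit.
    scores = start_logits[:, None] + end_logits[None, :]

    candidates = []
    for start in range(len(start_logits)):
        for end in range(start, min(start + max_answer_len, len(end_logits))):
            candidates.append((float(scores[start, end]), start, end))

    # Highest score first; return the top_k best spans.
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k]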
When I used this model through the SageMaker API, I realised it returns just a single answer with a score assigned to it. I wanted to check how this answer is produced (happy to just be pointed to the source code if it’s available) and whether it’s possible to return the n most likely answers instead of just one.
Understood, that’s very helpful, thank you. Looking at the code, it seems that max_answer_len, handle_impossible_answer & topk together get me exactly what I’m looking for, so that’s perfect! No need for my own inference.py script. Thank you!
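For anyone landing here later, calling the question-answering pipeline locally with those parameters looks roughly like this (a sketch; swap in your own fine-tuned model, and note the argument is topk on older transformers releases and top_k on newer ones):

from transformers import pipeline

# Placeholder SQuAD model; replace with your own fine-tuned checkpoint.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

answers = qa(
    question="Which company is mentioned?",
    context="The contract was signed by Acme Corp in 2021.",
    top_k=3,                       # return the 3 best spans instead of one
    max_answer_len=15,             # discard spans that are too long
    handle_impossible_answer=True, # allow an empty answer if nothing fits
)
print(answers)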
Can you please elaborate on how you fixed it? It would be really helpful, as I am dealing with the same problem right now.
This is the model I am trying to get multiple inferences from, for which SageMaker is returning just one:
hub = {
    'HF_MODEL_ID': 'valhalla/t5-base-qa-qg-hl',
    'HF_TASK': 'text2text-generation'
}
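For context, that hub config is passed to HuggingFaceModel roughly like this (a sketch; the role and container versions are placeholders that depend on your setup):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes this runs inside SageMaker

huggingface_model = HuggingFaceModel(
    env=hub,                      # HF_MODEL_ID / HF_TASK from above
    role=role,
    transformers_version="4.26",  # example versions; use ones your toolkit supports
    pytorch_version="1.13",
    py_version="py39",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)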
Hey @casafurix! You can use the top_k parameter to get more than one answer. If you set it to 5, for example, you’ll get the top 5 answers ranked by score.
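If the endpoint is running the question-answering task, the request looks roughly like this (a sketch; the parameters dict is forwarded to the underlying pipeline call):

# `predictor` is the HuggingFacePredictor returned by huggingface_model.deploy(...) above.
data = {
    "inputs": {
        "question": "Which company signed the contract?",
        "context": "The contract was signed by Acme Corp in 2021.",
    },
    "parameters": {"top_k": 5},  # return the 5 best answers instead of one
}

result = predictor.predict(data)
print(result)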