Returning Multiple Answers for a QA Model on SageMaker

Hi,

I’ve fine-tuned this Question-Answering model to fit a specific business use case we have: identifying the name of a company from a piece of text. When it comes to inference, I’ve found, as @sgugger has very clearly explained in this notebook, that the best answer isn’t always the one with the highest start and end logits, because the highest-scoring combination can produce an answer that is too long or too short (just one character).

As such, when I was predicting with this model locally, I created a return_best_combinaton function that finds the most practical answer using the lists of logit scores.
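
For illustration, here is a minimal sketch of this kind of selection. It is not the actual return_best_combinaton function (which isn't shown in this thread); the name best_spans and the parameters n_best, min_answer_len, max_answer_len and top_k are assumptions made for the example. It scores each start/end pair by the sum of its logits and discards spans that end before they start or fall outside the length limits (measured in tokens).

```python
def best_spans(start_logits, end_logits, n_best=20,
               min_answer_len=1, max_answer_len=30, top_k=3):
    """Return up to top_k plausible (start, end) token spans, best first.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    # Keep only the n_best highest-scoring start and end positions.
    starts = sorted(range(len(start_logits)),
                    key=lambda i: start_logits[i], reverse=True)[:n_best]
    ends = sorted(range(len(end_logits)),
                  key=lambda i: end_logits[i], reverse=True)[:n_best]

    candidates = []
    for s in starts:
        for e in ends:
            length = e - s + 1  # span length in tokens, inclusive
            # Skip spans that end before they start, or are too short/long.
            if length < min_answer_len or length > max_answer_len:
                continue
            candidates.append({
                "start": s,
                "end": e,
                "score": start_logits[s] + end_logits[e],
            })

    # Highest combined logit score first.
    candidates.sort(key=lambda c: c["score"], reverse=True)
    return candidates[:top_k]


# The best end logit (index 0) comes before the best start logit (index 1),
# so that pairing is rejected and the next-best valid span is chosen.
spans = best_spans([0.1, 5.0, 0.2, 0.3], [6.0, 0.1, 4.0, 0.2], top_k=2)
```

The returned indices are token positions, so they would still need to be mapped back to character offsets in the context to recover the answer text.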

When I used this model through the SageMaker API, I realised it returns just a single answer with a score assigned to it. I wanted to check how this answer is produced (happy to just be directed to the source code if it’s available) and whether it’s possible to return the n most likely answers instead of just one.

Thanks ever so much,
Karim

The Inference Toolkit uses the transformers pipelines under the hood. So if you are deploying a model for Question-Answering, it uses pipeline("question-answering"). You can find the code for this here: transformers/question_answering.py at master · huggingface/transformers · GitHub

But if you want to use your own return_best_combinaton function, you could create a custom inference.py with your own “prediction” step.
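
As a rough sketch of what such an inference.py could look like: the toolkit lets you override model_fn(model_dir) and predict_fn(data, model). The post-processing below (dropping answers shorter than a hypothetical MIN_ANSWER_CHARS) is only a stand-in for your own selection logic, and the request layout with "inputs" and optional "parameters" is assumed to match the default handler.

```python
# inference.py (sketch; typically placed under code/ in the model archive)

MIN_ANSWER_CHARS = 2  # hypothetical threshold, stands in for custom logic


def model_fn(model_dir):
    """Load the question-answering pipeline from the model directory."""
    # Imported here so the module can be loaded without transformers present.
    from transformers import pipeline
    return pipeline("question-answering", model=model_dir, tokenizer=model_dir)


def predict_fn(data, model):
    """Run the pipeline, then apply a custom filter to its answers."""
    inputs = data["inputs"]
    params = data.get("parameters") or {}

    result = model(question=inputs["question"],
                   context=inputs["context"],
                   **params)

    # The pipeline returns a dict for one answer and a list for several.
    answers = [result] if isinstance(result, dict) else list(result)

    # Custom "prediction" step: discard implausibly short answers.
    return [a for a in answers if len(a["answer"].strip()) >= MIN_ANSWER_CHARS]
```

Anything the pipeline accepts as a keyword argument can be forwarded through "parameters", so the custom step only needs to cover what the pipeline cannot do on its own.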


Understood, that’s very helpful, thank you. Looking at the code, it seems that max_answer_len, handle_impossible_answer and topk together get me what I’m looking for, so that’s perfect! No need for my own inference.py script. Thank you!
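
A minimal sketch of how those arguments might be sent with a request, assuming the default handler forwards the "parameters" entry to the pipeline as keyword arguments. The helper build_qa_request and its default values are hypothetical, and newer transformers releases name the first argument top_k rather than topk, so the key may need adjusting to the installed version.

```python
import json


def build_qa_request(question, context, topk=3, max_answer_len=15,
                     handle_impossible_answer=False):
    """Build a request body asking for several candidate answers.

    Hypothetical helper; defaults are illustrative only.
    """
    return {
        "inputs": {"question": question, "context": context},
        "parameters": {
            "topk": topk,  # number of answers to return
            "max_answer_len": max_answer_len,  # maximum answer length in tokens
            "handle_impossible_answer": handle_impossible_answer,
        },
    }


payload = build_qa_request(
    question="Which company is mentioned?",
    context="Acme Corp announced a new line of anvils today.",
)
body = json.dumps(payload)  # JSON body for the endpoint

# With a deployed endpoint this would be sent as, for example:
# predictor.predict(payload)   # `predictor` is assumed, not defined here
```

With topk greater than 1 the pipeline returns a list of answers, each with its own score, start and end, instead of a single dict.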
