Extractive Q&A - HF pipeline top_k returns same span as different answers

eelang · September 6, 2023, 2:07pm

I’m using the HF pipeline for extractive Q&A with top_k=K (K is some integer greater than 1). I measure the goodness of answers with the perplexity score. I noticed that sometimes I get the same span returned as different answers. (For example, the 1st answer with score 0.35 and the 3rd answer with score 0.11 are the same span.)
This is corrupting my perplexity scores which I use downstream to select the best question to extract what I need from the data.
Has anyone come across this and, if so, how did you solve it (other than manually grouping the results)?
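
For reference, the manual grouping fallback mentioned above could look roughly like the sketch below. It assumes each result is a dict with `score`, `start`, `end` and `answer` keys (the shape the question-answering pipeline returns), and that duplicates share identical `start`/`end` offsets. The helper name `dedupe_spans` and the example values are made up for illustration.

```python
def dedupe_spans(answers):
    """Keep only the highest-scoring entry for each (start, end) span."""
    best = {}
    for ans in answers:
        key = (ans["start"], ans["end"])  # assumption: duplicates share offsets
        if key not in best or ans["score"] > best[key]["score"]:
            best[key] = ans
    # Return the surviving answers ordered from best to worst score.
    return sorted(best.values(), key=lambda a: a["score"], reverse=True)


# Hypothetical results shaped like pipeline output with top_k=3.
example = [
    {"score": 0.35, "start": 10, "end": 16, "answer": "Berlin"},
    {"score": 0.20, "start": 40, "end": 46, "answer": "Munich"},
    {"score": 0.11, "start": 10, "end": 16, "answer": "Berlin"},
]
deduped = dedupe_spans(example)
```

This keeps the maximum score per span rather than summing the duplicates; which of the two is appropriate depends on how the score is used downstream. If the duplicates differ in offsets but not in text, keying on a normalized `answer` string instead of the offsets would be the analogous approach.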
