CPU-based BERT question answering model

Which question answering model gives the lowest inference latency on CPU? If I want to quantize the model, what are the best approaches? Any reference links?