How to use pipelines for question-answering

I’m trying to use pipelines to do extractive question answering with a model trained on SQuAD, but I can’t figure out how to pass data to the pipeline correctly. I suspect there might be a bug in the question-answering pipeline.

Here’s what I tried:

from transformers import pipeline
context = "There are three major types of rock: igneous, sedimentary, and metamorphic. The rock cycle is an important concept in geology which illustrates the relationships between these three types of rock, and magma. When a rock crystallizes from melt (magma and/or lava), it is an igneous rock. This rock can be weathered and eroded, and then redeposited and lithified into a sedimentary rock, or be turned into a metamorphic rock due to heat and pressure that change the mineral content of the rock which gives it a characteristic fabric. The sedimentary rock can then be subsequently turned into a metamorphic rock due to heat and pressure and is then weathered, eroded, deposited, and lithified, ultimately becoming a sedimentary rock. Sedimentary rock may also be re-eroded and redeposited, and metamorphic rock may also undergo additional metamorphism. All three types of rocks may be re-melted; when this happens, a new magma is formed, from which an igneous rock may once again crystallize."
question = "What are the three major types of rock?"
qa_pipe = pipeline("question-answering", model="distilbert-base-cased-distilled-squad", tokenizer="bert-base-cased")
qa_pipe({ "context": context, "question": question })

And here’s the error:

TypeError                                 Traceback (most recent call last)
<ipython-input-21-e7bacfef2b9c> in <module>()
      8 question = "What are the three major types of rock?"
      9 qa_pipe = pipeline("question-answering", model="distilbert-base-cased-distilled-squad", tokenizer="bert-base-cased")
---> 10 qa_pipe({"context": context, "question": question })

5 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

TypeError: forward() got an unexpected keyword argument 'token_type_ids'
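
My best guess at the cause: BERT-style tokenizers return a token_type_ids entry alongside input_ids and attention_mask, while the DistilBERT forward() has no such parameter, so unpacking the tokenizer output into the model call fails. The sketch below reproduces that failure mode without transformers. The function `distilbert_style_forward` and the token ids are hypothetical stand-ins for illustration, not the library's real code.

```python
# Hypothetical stand-in for a DistilBERT-style forward(): it accepts
# input_ids and attention_mask but has no token_type_ids parameter.
def distilbert_style_forward(input_ids=None, attention_mask=None):
    return {"n_tokens": len(input_ids)}


# Shape of what a BERT-style tokenizer returns. The ids are arbitrary
# placeholders; the point is the extra token_type_ids key.
bert_style_encoding = {
    "input_ids": [101, 1327, 102],
    "attention_mask": [1, 1, 1],
    "token_type_ids": [0, 0, 0],
}

# Unpacking the encoding into the call raises the same kind of TypeError
# as in the traceback above.
try:
    distilbert_style_forward(**bert_style_encoding)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Dropping the unexpected key lets the same call go through.
compatible = {k: v for k, v in bert_style_encoding.items() if k != "token_type_ids"}
result = distilbert_style_forward(**compatible)
print(result)
```

If this reading is right, the problem is the model/tokenizer mismatch rather than the way the context and question are passed in.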

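Update: after I removed the tokenizer= argument (and the model= argument, so the pipeline falls back to its default checkpoint), it worked. I had copied the tokenizer= argument from the pipeline example in the docs.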
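
If you do want to name the model explicitly, one option that should avoid the mismatch is to point the tokenizer at the same checkpoint as the model. This is a minimal sketch that I have not re-run against every transformers version; `build_qa_pipeline` and `MODEL_NAME` are names I made up for illustration, while `pipeline(...)` with `model=` and `tokenizer=` is the same call used above.

```python
MODEL_NAME = "distilbert-base-cased-distilled-squad"


def build_qa_pipeline(checkpoint=MODEL_NAME):
    """Build a question-answering pipeline whose tokenizer matches its model."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed or any weights to be downloaded.
    from transformers import pipeline

    # Using the same checkpoint name for both arguments keeps the tokenizer
    # output in line with what the model's forward() accepts.
    return pipeline("question-answering", model=checkpoint, tokenizer=checkpoint)
```

Calling `qa_pipe = build_qa_pipeline()` and then `qa_pipe(question=question, context=context)` should behave like the working snippet below, assuming the checkpoint can be downloaded.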

context = "There are three major types of rock: igneous, sedimentary, and metamorphic. The rock cycle is an important concept in geology which illustrates the relationships between these three types of rock, and magma. When a rock crystallizes from melt (magma and/or lava), it is an igneous rock. This rock can be weathered and eroded, and then redeposited and lithified into a sedimentary rock, or be turned into a metamorphic rock due to heat and pressure that change the mineral content of the rock which gives it a characteristic fabric. The sedimentary rock can then be subsequently turned into a metamorphic rock due to heat and pressure and is then weathered, eroded, deposited, and lithified, ultimately becoming a sedimentary rock. Sedimentary rock may also be re-eroded and redeposited, and metamorphic rock may also undergo additional metamorphism. All three types of rocks may be re-melted; when this happens, a new magma is formed, from which an igneous rock may once again crystallize."
question = "What are the three major types of rock?"
qa_pipe = pipeline("question-answering")
qa_pipe({ "context": context, "question": question })
{'answer': 'igneous, sedimentary, and metamorphic',
 'end': 74,
 'score': 0.94291090965271,
 'start': 37}
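
As far as I can tell, start and end are character offsets into the context string, so slicing the context with them gives back the answer text. A small check using the result above; to keep it short I only include the first sentence of the context, which is enough because the answer lies inside it.

```python
# First sentence of the context used above (the offsets only reach index 74).
context = "There are three major types of rock: igneous, sedimentary, and metamorphic."

# Result dictionary copied from the pipeline output above.
result = {
    "answer": "igneous, sedimentary, and metamorphic",
    "end": 74,
    "score": 0.94291090965271,
    "start": 37,
}

# Slice the context with the reported character offsets.
span = context[result["start"]:result["end"]]
print(span)

# The slice matches the reported answer string.
matches = span == result["answer"]
print(matches)
```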