Completely different results for model in pipeline and by itself

Hi, I’m working on a question answering project.
I get completely different results when I put the model in the pipeline compared with calling the model directly.

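from transformers import QuestionAnsweringPipeline

# model, tokenizer, preprocess and get_predictions are my own setup (not shown)
pipe = QuestionAnsweringPipeline(model=model, tokenizer=tokenizer)
for index, row in test.iterrows():
    question = row.question
    text_id = row.id
    context = row.context
    # Path 1: call the model directly on my own preprocessed inputs
    inputs, offset_mapping = preprocess(question, context, text_id)
    outputs = model(**inputs)
    results_model = get_predictions(outputs, inputs, offset_mapping, context)
    # Path 2: let the pipeline do its own tokenization and post-processing
    results_pipe = pipe(question=question, context=context, top_k=2)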

results_model is:

[{'answer': 'During a group project, have you ever asked a group member about adding or replacing something? Or, when you were studying for a math test, did you ever ask your parents or sibling about different ways to tackle a certain problem?',
  'start': tensor(0),
  'end': tensor(230),
  'score': 0.6750812530517578},
 {'answer': '',
  'start': tensor(0),
  'end': tensor(0),
  'score': 0.9995814561843872}]

results_pipe is:

[{'score': 0.0671093538403511, 'start': 2711, 'end': 2716, 'answer': 'Since'},
 {'score': 0.014637975953519344, 'start': 2879, 'end': 2881, 'answer': 'If'}]

The discrepancy is the same whether the model is in train or eval mode.

Why could this be?
I’m using the PyTorch backend.

Also, a side question if I may: I want to use the pipeline to speed up inference.
Is there another way to do that?

Ah, actually, I think I see the issue, right after posting.
I think it’s about how the tokenizer is set up. Could that be it?
In preprocess I make sure to use overlapping windows, padding, and max length = model.max_length.
I’m guessing that when I just pass the tokenizer to the pipeline, none of these get set by default.
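
To make the overlap part concrete, here is a rough pure-Python sketch of splitting a long token sequence into overlapping windows. It is only an illustration of the idea, not the tokenizer's or the pipeline's actual implementation; the function name split_with_overlap and the parameters max_len and overlap are made up for this example.

```python
def split_with_overlap(tokens, max_len, overlap):
    """Split tokens into windows of at most max_len items.

    Consecutive windows share `overlap` items, so an answer that
    straddles a window boundary still appears whole in some window.
    (Hypothetical helper for illustration only.)
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap  # how far each new window advances
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end
        start += step
    return windows


tokens = list(range(10))  # stand-in for token ids
windows = split_with_overlap(tokens, max_len=4, overlap=2)
for w in windows:
    print(w)
```

If the window length or the overlap differs between two code paths, the model sees different chunks of the context, so different answer spans and scores are to be expected.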

Can I pass a pre-configured tokenizer?

Figured it out: there are two parameters, max_seq_len and max_answer_len, that can be set during the call but not at initialization.
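
For anyone landing here later, a minimal sketch of what passing these at call time could look like. The wrapper name ask_with_matching_settings is hypothetical, and the default values (384, 128, 30) are placeholders, not the values from my preprocess; they should be replaced with whatever the manual preprocessing uses. I also pass doc_stride, which as far as I can tell is the call-time parameter that controls the overlap between windows.

```python
def ask_with_matching_settings(pipe, question, context,
                               max_seq_len=384, doc_stride=128,
                               max_answer_len=30, top_k=2):
    """Call a question answering pipeline with explicit windowing settings.

    `pipe` is assumed to be a transformers QuestionAnsweringPipeline
    (any callable accepting these keyword arguments works).
    The numeric defaults are placeholders for illustration.
    """
    return pipe(
        question=question,
        context=context,
        top_k=top_k,
        max_seq_len=max_seq_len,        # max tokens per window (question + context)
        doc_stride=doc_stride,          # overlap between consecutive windows
        max_answer_len=max_answer_len,  # longest answer span to consider
    )


# Usage, assuming `pipe`, `question` and `context` exist as in the loop above:
# results_pipe = ask_with_matching_settings(pipe, question, context)
```

Even with matching settings the two outputs may not be identical. The empty answer with start 0 and end 0 in results_model looks like a "no answer" prediction, and I believe the pipeline only returns those when handle_impossible_answer=True is passed, so that may be another source of difference.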

Thanks a lot!