How to load weights for a pre-trained model for question answering?

What should I change in the snippet below to get consistent and accurate output?

Code Snippet:
from transformers import ElectraTokenizer, TFElectraForQuestionAnswering, ElectraConfig
import tensorflow as tf

configuration = ElectraConfig()
tokenizer = ElectraTokenizer.from_pretrained('google/electra-small-discriminator')
TFElect = TFElectraForQuestionAnswering(configuration)
#model = TFElectraForQuestionAnswering.from_pretrained('google/electra-small-discriminator')
model = TFElect.from_pretrained('google/electra-small-discriminator')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
input_dict = tokenizer(question, text, return_tensors='tf')
outputs = model(input_dict, return_dict=True)
#print(outputs)
start_logits = outputs.start_logits
end_logits = outputs.end_logits
all_tokens = tokenizer.convert_ids_to_tokens(input_dict["input_ids"].numpy()[0])
answer = ' '.join(all_tokens[tf.math.argmax(start_logits, 1)[0] : tf.math.argmax(end_logits, 1)[0]+1])
print(answer)

Output: I get a different and incorrect answer every time I run it, so it seems the model doesn't have any pre-trained weights for the QA task [I'm also getting the warning below].

Warning:
Some layers from the model checkpoint at google/electra-small-discriminator were not used when initializing TFElectraForQuestionAnswering: ['discriminator_predictions']

  • This IS expected if you are initializing TFElectraForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing TFElectraForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Some layers of TFElectraForQuestionAnswering were not initialized from the model checkpoint at google/electra-small-discriminator and are newly initialized: ['qa_outputs']
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

You are using a model that is not fine-tuned on question answering, so its question-answering head is initialized with random weights. That is why you get the warning, and why you get incorrect results that change on every run.

You should pick a model on the Hub that has been fine-tuned on SQuAD (see list here), for instance distilbert-base-cased-distilled-squad.
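For reference, here is a minimal sketch of using such a checkpoint. It assumes the `distilbert-base-cased-distilled-squad` checkpoint suggested above and uses the `pipeline` API, which handles tokenization and start/end span extraction for you, so the manual `argmax` logic from the original snippet isn't needed:

```python
from transformers import pipeline

# Load a checkpoint whose QA head was fine-tuned on SQuAD, so no layers
# are randomly initialized and the output is deterministic across runs.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

result = qa(
    question="Who was Jim Henson?",
    context="Jim Henson was a nice puppet",
)
# result is a dict with "answer", "score", "start", and "end" keys.
print(result["answer"])
```

The same idea applies to the original snippet: replace `google/electra-small-discriminator` in `from_pretrained` with any checkpoint fine-tuned on question answering, and the `qa_outputs` warning goes away.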

1 Like

Thanks @sgugger, I understand the issue now.