How to load weights for Pre-trained model for Question Answering?

PremalMatalia · April 3, 2021, 12:18pm

What should I change in the snippet below to get consistent and accurate output?

Code Snippet:
from transformers import ElectraTokenizer, TFElectraForQuestionAnswering, ElectraConfig
import tensorflow as tf

configuration = ElectraConfig()
tokenizer = ElectraTokenizer.from_pretrained('google/electra-small-discriminator')
TFElect = TFElectraForQuestionAnswering(configuration)
#model = TFElectraForQuestionAnswering.from_pretrained('google/electra-small-discriminator')
model = TFElect.from_pretrained('google/electra-small-discriminator')

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
input_dict = tokenizer(question, text, return_tensors='tf')
outputs = model(input_dict, return_dict=True)
#print(outputs)
start_logits = outputs.start_logits
end_logits = outputs.end_logits
all_tokens = tokenizer.convert_ids_to_tokens(input_dict["input_ids"].numpy()[0])
answer = ' '.join(all_tokens[tf.math.argmax(start_logits, 1)[0] : tf.math.argmax(end_logits, 1)[0]+1])
print(answer)

Output: I get a different, incorrect answer every time I run it, so it seems the model doesn’t have any pre-trained weights for the QnA task. [I also get the warning below.]

Warning:
Some layers from the model checkpoint at google/electra-small-discriminator were not used when initializing TFElectraForQuestionAnswering: ['discriminator_predictions']

This IS expected if you are initializing TFElectraForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing TFElectraForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFElectraForQuestionAnswering were not initialized from the model checkpoint at google/electra-small-discriminator and are newly initialized: ['qa_outputs']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

sgugger · April 5, 2021, 1:22pm

You are using a checkpoint that is not fine-tuned on question answering, so the question answering head (qa_outputs) is initialized with random weights. That is why you get the warning, and why the results are incorrect and change at every run.

You should pick a model on the Hub that is fine-tuned on SQuAD (see list here), for instance distilbert-base-cased-distilled-squad.
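As a rough sketch of that suggestion, the snippet from the question could be adapted along these lines. It assumes an installed transformers version that still ships the TensorFlow classes and that the checkpoint provides TensorFlow weights; the helper names `extract_answer` and `answer_question` are made up for illustration and are not part of the library.

```python
def extract_answer(tokens, start_logits, end_logits):
    """Join the tokens between the highest-scoring start and end positions.

    Hypothetical helper: takes plain Python lists so it can be checked
    without loading a model.
    """
    start = max(range(len(start_logits)), key=start_logits.__getitem__)
    end = max(range(len(end_logits)), key=end_logits.__getitem__)
    if end < start:
        # The model predicted an end before the start: no usable span.
        return ""
    return " ".join(tokens[start:end + 1])


def answer_question(question, text,
                    checkpoint="distilbert-base-cased-distilled-squad"):
    """Hypothetical wrapper: load a QA fine-tuned checkpoint and answer once."""
    # Imported here so extract_answer stays usable without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForQuestionAnswering

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # The checkpoint already contains a trained qa_outputs head, so the
    # "newly initialized" warning should not appear for it.
    model = TFAutoModelForQuestionAnswering.from_pretrained(checkpoint)

    inputs = tokenizer(question, text, return_tensors="tf")
    outputs = model(**inputs)

    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"].numpy()[0])
    return extract_answer(
        tokens,
        outputs.start_logits.numpy()[0].tolist(),
        outputs.end_logits.numpy()[0].tolist(),
    )


# Example call (downloads the checkpoint on first use):
# print(answer_question("Who was Jim Henson?", "Jim Henson was a nice puppet"))
```

Because the question answering head comes from the checkpoint rather than from random initialization, repeated runs should give the same answer. Joining raw tokens with spaces leaves WordPiece markers such as ## in the output; `tokenizer.decode` on the selected input ids is one way to get cleaner text.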

PremalMatalia · April 9, 2021, 5:34pm

Thanks @sgugger …I understood the issue.
