I tried to use a question-answering model with two checkpoints, bert-base-uncased and distilbert-base-uncased-distilled-squad. Both give errors.
This is the code I tried:
from transformers import DistilBertTokenizer, TFDistilBertForQuestionAnswering
import tensorflow as tf

tokenizer = DistilBertTokenizer.from_pretrained("bert-base-uncased")
model = TFDistilBertForQuestionAnswering.from_pretrained("bert-base-uncased")

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

inputs = tokenizer(question, text, return_tensors="tf")
outputs = model(**inputs)

answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1))
answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1))

predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
tokenizer.decode(predict_answer_tokens)
Allocation of 93763584 exceeds 10% of free system memory.
All model checkpoint layers were used when initializing TFDistilBertForQuestionAnswering.
All the layers of TFDistilBertForQuestionAnswering were initialized from the model checkpoint at distilbert-base-uncased-distilled-squad.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForQuestionAnswering for predictions without further training.
Some layers from the model checkpoint at bert-base-uncased were not used when initializing TFDistilBertForQuestionAnswering: ['mlm___cls', 'nsp___cls', 'bert']
- This IS expected if you are initializing TFDistilBertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFDistilBertForQuestionAnswering were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['distilbert', 'qa_outputs', 'dropout_37']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
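My reading of the bert-base-uncased warnings, as a rough sketch: the layers appear to be matched by name, and none of the names stored in that checkpoint match what TFDistilBertForQuestionAnswering expects. The snippet below is only a toy illustration with plain Python sets, not the actual Transformers loading logic. `compare_layers` is a made-up helper, and the layer names are copied from the log output above.

```python
# Toy illustration only: this is NOT how Transformers loads weights internally.
# The top-level layer names are copied from the warnings in the log.
checkpoint_layers = {"mlm___cls", "nsp___cls", "bert"}       # found in bert-base-uncased
model_layers = {"distilbert", "qa_outputs", "dropout_37"}    # expected by the QA model


def compare_layers(checkpoint, model):
    """Return (unused, newly_initialized) as sorted lists of layer names."""
    unused = sorted(checkpoint - model)             # in the checkpoint, not in the model
    newly_initialized = sorted(model - checkpoint)  # in the model, not in the checkpoint
    return unused, newly_initialized


unused, newly_initialized = compare_layers(checkpoint_layers, model_layers)
print("Not used from checkpoint:", unused)
print("Newly initialized:", newly_initialized)
```

If this reading is right, no pretrained weights end up in the model when it is loaded from bert-base-uncased, which would explain the advice to train it before using it for predictions.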