I am running the named entity recognition code from your documentation and am trying to save this “ner” model locally:
```python
from transformers import pipeline

nlp = pipeline("ner")

sequence = "Hugging Face Inc. is a company based in New York City. Its headquarters are in DUMBO, therefore very " \
           "close to the Manhattan Bridge which is visible from the window."

nlp.save_pretrained("path to folder")
```
When I go to load this model and make predictions, I get the error “IndexError: list index out of range”, pointing to the very last line below:
```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model = AutoModelForTokenClassification.from_pretrained("path to folder")
tokenizer = AutoTokenizer.from_pretrained("path to folder")

label_list = [
    "O",       # Outside of a named entity
    "B-MISC",  # Beginning of a miscellaneous entity right after another miscellaneous entity
    "I-MISC",  # Miscellaneous entity
    "B-PER",   # Beginning of a person's name right after another person's name
    "I-PER",   # Person's name
    "B-ORG",   # Beginning of an organisation right after another organisation
    "I-ORG",   # Organisation
    "B-LOC",   # Beginning of a location right after another location
    "I-LOC",   # Location
]

sequence = "Hugging Face Inc. is a company based in New York City. Its headquarters are in DUMBO, therefore very " \
           "close to the Manhattan Bridge."

# Bit of a hack to get the tokens with the special tokens
tokens = tokenizer.tokenize(tokenizer.decode(tokenizer.encode(sequence)))
inputs = tokenizer.encode(sequence, return_tensors="pt")

outputs = model(inputs)
predictions = torch.argmax(outputs, dim=2)

print([(token, label_list[prediction]) for token, prediction in zip(tokens, predictions.tolist())])
```
I would like to get the entity for each token. I believe the error is in the `label_list` portion of the code. When I ran the following, I got each token along with its prediction represented as an integer:
```python
print([(token, prediction) for token, prediction in zip(tokens, predictions.tolist())])
```
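To check my understanding of the shapes involved, I sketched the indexing with a toy stand-in (fake logits, not my saved model): `torch.argmax(..., dim=2)` gives a tensor of shape `(batch, seq_len)`, so `predictions.tolist()` is a nested list, and I suspect I need `predictions[0].tolist()` to pair each token with a single label index:

```python
import torch

label_list = ["O", "B-MISC", "I-MISC", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

# Toy stand-in for the model's logits: batch of 1 sequence, 4 tokens, 9 labels.
logits = torch.zeros(1, 4, len(label_list))
logits[0, 0, 5] = 1.0  # pretend token 0 looks like B-ORG
logits[0, 1, 6] = 1.0  # pretend token 1 looks like I-ORG
# tokens 2 and 3 stay all-zero, so argmax picks index 0 ("O")

predictions = torch.argmax(logits, dim=2)  # shape (1, 4), one label index per token
# predictions.tolist() is a NESTED list: [[5, 6, 0, 0]]

tokens = ["Hug", "##ging", "Face", "Inc"]  # made-up subword tokens for illustration
pairs = [(t, label_list[p]) for t, p in zip(tokens, predictions[0].tolist())]
print(pairs)
```

With the extra `[0]`, each `p` is a plain integer index into `label_list`, which is what the documentation example seems to expect.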
I am unable to recreate the output shown on the website due to that error. Any help would be much appreciated.