Hello,
I am running the code from your documentation for named entity recognition, and I am trying to save the “ner” pipeline locally:
from transformers import pipeline
nlp = pipeline("ner")
sequence = "Hugging Face Inc. is a company based in New York City. Its headquarters are in DUMBO, therefore very " \
           "close to the Manhattan Bridge which is visible from the window."
nlp.save_pretrained("path to folder")
When I go to load this model and make predictions, I get the error “IndexError: list index out of range”, pointing to the very last line of the code below:
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model = AutoModelForTokenClassification.from_pretrained("path to folder")
tokenizer = AutoTokenizer.from_pretrained("path to folder")
label_list = [
    "O",       # Outside of a named entity
    "B-MISC",  # Beginning of a miscellaneous entity right after another miscellaneous entity
    "I-MISC",  # Miscellaneous entity
    "B-PER",   # Beginning of a person's name right after another person's name
    "I-PER",   # Person's name
    "B-ORG",   # Beginning of an organisation right after another organisation
    "I-ORG",   # Organisation
    "B-LOC",   # Beginning of a location right after another location
    "I-LOC"    # Location
]
sequence = "Hugging Face Inc. is a company based in New York City. Its headquarters are in DUMBO, therefore very " \
           "close to the Manhattan Bridge."
# Bit of a hack to get the tokens with the special tokens
tokens = tokenizer.tokenize(tokenizer.decode(tokenizer.encode(sequence)))
inputs = tokenizer.encode(sequence, return_tensors="pt")
outputs = model(inputs)[0]
predictions = torch.argmax(outputs, dim=2)
print([(token, label_list[prediction]) for token, prediction in zip(tokens, predictions[0].tolist())])
I would like to get the entity for each token. I believe the error is in the “label_list” portion of the code; when I run the following instead, I do get each token along with its prediction represented as an integer:
print([(token, prediction) for token, prediction in zip(tokens, predictions[0].tolist())])
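To illustrate what I think is happening, here is a minimal, self-contained sketch (with made-up prediction ids, no model involved) of how label_list[prediction] raises IndexError whenever a predicted id falls outside my nine hard-coded labels, and a guarded lookup that shows which ids are out of range:

```python
# Made-up prediction ids for illustration only, not real model output.
label_list = ["O", "B-MISC", "I-MISC", "B-PER", "I-PER",
              "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

predictions = [0, 3, 4, 10]  # hypothetical ids; 10 is outside the list

# label_list[10] would raise "IndexError: list index out of range",
# the same error I am seeing. A guarded lookup makes the bad ids visible:
guarded = [label_list[p] if p < len(label_list) else f"UNKNOWN-{p}"
           for p in predictions]
print(guarded)  # ['O', 'B-PER', 'I-PER', 'UNKNOWN-10']
```

If that is indeed the cause, I suspect the proper fix is to read the mapping back from the saved config (model.config.id2label) rather than hard-coding the list, but I am not sure.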
I am unable to recreate the output shown on the website due to that error. Any help would be much appreciated.