Get all labels / entity groups available to a model

jeril · May 17, 2023, 8:39am

I have the following code to get the named entity values from a given text:

from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

tokenizer = AutoTokenizer.from_pretrained("Davlan/distilbert-base-multilingual-cased-ner-hrl")
model = AutoModelForTokenClassification.from_pretrained("Davlan/distilbert-base-multilingual-cased-ner-hrl")
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="max")

example = "My name is Johnathan Smith and I work at Apple"
ner_results = nlp(example)
print(ner_results)

The following is the output:

[{'end': 26,
  'entity_group': 'PER',
  'score': 0.9994689,
  'start': 11,
  'word': 'Johnathan Smith'},
 {'end': 46,
  'entity_group': 'ORG',
  'score': 0.9983876,
  'start': 41,
  'word': 'Apple'}]

In the above example, the labels / entity groups are ORG and PER. How can I find all the labels / entity groups available to the model?
Kindly advise.

jmtk · November 30, 2023, 1:20pm

I found this in the config.json at the root of the model files. It was semi-helpful, although there’s not really a great explanation of what the labels mean.
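For anyone landing here, a minimal sketch of reading the same information programmatically. The label mapping stored in config.json is exposed on the loaded config as `id2label`. The helper names `entity_groups` and `labels_for_model` are made up for illustration, and the sample mapping is hypothetical rather than copied from the model above. The prefix stripping assumes the model uses BIO-style tags such as B-PER / I-PER, which is what the pipeline collapses into `entity_group` when an aggregation strategy is set.

```python
def entity_groups(id2label):
    """Collapse BIO-style tags (B-PER, I-PER, ...) into sorted entity group names."""
    groups = set()
    for label in id2label.values():
        if label == "O":
            # "O" marks tokens outside any entity, so it is not a group.
            continue
        prefix, sep, rest = label.partition("-")
        # Strip a leading "B-" or "I-" if present; otherwise keep the label as-is.
        groups.add(rest if sep and prefix in {"B", "I"} else label)
    return sorted(groups)


def labels_for_model(model_name):
    """Return (raw id2label mapping, entity groups) for a model on the Hub."""
    # Imported here so entity_groups() can be used without transformers installed.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(model_name)
    return config.id2label, entity_groups(config.id2label)


# Hypothetical mapping, for illustration only (not read from a real model).
sample = {0: "O", 1: "B-PER", 2: "I-PER", 3: "B-ORG", 4: "I-ORG"}
print(entity_groups(sample))  # ['ORG', 'PER']
```

Calling `labels_for_model("Davlan/distilbert-base-multilingual-cased-ner-hrl")` should return the raw tag mapping together with the collapsed group names. If the model is already loaded as in the question, `model.config.id2label` gives the same mapping without a second download.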
