Hello, I'm trying to get jjzha/esco-xlm-roberta-large from Hugging Face to work, but it seems I'm missing or misconfiguring something. (I'm quite new to Hugging Face models.)
In the model card, RobertaForCustomMaskedLM is used, but I can't find this class in the transformers library (my version is 4.31.0). I also couldn't find anything on the net about RobertaForCustomMaskedLM. Is it possible that this class was created only for this model and I need to get it from somewhere else?
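Just to double-check that the class really isn't part of the library, I ran this small check against my installed transformers (nothing model-specific, just hasattr):

import transformers

print(transformers.__version__)                           # 4.31.0 in my case
print(hasattr(transformers, "RobertaForCustomMaskedLM"))  # False -> not shipped with transformers
print(hasattr(transformers, "XLMRobertaForMaskedLM"))     # True  -> this one exists, so I fell back to it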
I could at least run the model using XLMRobertaForMaskedLM, but I'm getting strange labels for my skill analysis this way.
Below is the complete code. The only label I ever see is "O", which in my code is just the fallback for out-of-range IDs, so effectively an error label, right?
Is this because I'm not using RobertaForCustomMaskedLM, or did I make another mistake?
from transformers import AutoTokenizer, XLMRobertaForMaskedLM
import torch
tokenizer = AutoTokenizer.from_pretrained("jjzha/esco-xlm-roberta-large")
model = XLMRobertaForMaskedLM.from_pretrained("jjzha/esco-xlm-roberta-large")
# Example text for prediction
text = "The car is cool. Angular and node.js are frameworks."
max_length = 128
# Tokenize the input text
inputs = tokenizer.encode_plus(text,
                               add_special_tokens=True,
                               truncation=True,
                               max_length=max_length,
                               padding="max_length",
                               return_attention_mask=True,
                               return_tensors="pt")
# Get the list of tokens from the input text
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
# Make predictions using the model
outputs = model(**inputs)
# Get the predicted token-level labels
predicted_labels = torch.argmax(outputs.logits, dim=-1)[0]
# Get the list of labels from the model's configuration
label_list = list(model.config.id2label.values())
for label in label_list:
    print(f"Label: {label}")
# Create a list to store the tokens with their corresponding labels
tokens_with_labels = []
# Iterate through the tokens and their predicted labels
for token, label_id in zip(tokens, predicted_labels):
    # Ensure the label_id is within the valid range of labels
    print(f"{label_id.item()} => LabelId")
    if label_id.item() <= len(label_list) - 1:
        label = label_list[label_id.item()]
    else:
        label = "O"  # Assign a special label for out-of-range IDs (e.g., "O" for no entity)
    tokens_with_labels.append((token, label))
# Print the tokens with their predicted labels
for token, label in tokens_with_labels:
print(f"{token}: {label}")