Easiest way to perform inference

Thethela · November 1, 2022, 7:16am

I trained, evaluated, and saved my model: model_bert.save_pretrained("fine_tuned_model").
Now I want to perform inference, so I load the model using a pipeline: clf = pipeline("text-classification", "/content/fine_tuned_model").

Then I pass an array of text to the model: clf(tx,return_all_scores=True)

Then I get this result:

Disabling tokenizer parallelism, we're using DataLoader multithreading already
[[{'label': 'LABEL_0', 'score': 0.9765037894248962},
  {'label': 'LABEL_1', 'score': 0.30014175176620483},
  {'label': 'LABEL_2', 'score': 0.9280667901039124},
  {'label': 'LABEL_3', 'score': 0.06726877391338348},
  {'label': 'LABEL_4', 'score': 0.8652555346488953},
  {'label': 'LABEL_5', 'score': 0.15145337581634521}],
 [{'label': 'LABEL_0', 'score': 0.8798120021820068},
  {'label': 'LABEL_1', 'score': 0.006957885809242725},
  {'label': 'LABEL_2', 'score': 0.06591048091650009},
  {'label': 'LABEL_3', 'score': 0.0038158840034157038},
  {'label': 'LABEL_4', 'score': 0.06268540769815445},
  {'label': 'LABEL_5', 'score': 0.04680700972676277}]]

My questions are:

Is it possible to get the actual label names instead of LABEL_0?
Also, what is the best way to evaluate a model when you have a CSV file of texts? Is there a special function for that?

mapama247 · November 2, 2022, 9:52am

You see these “generic” label names because you didn’t specify the correct ones when fine-tuning the model. If you check your model’s configuration like this…

from transformers import AutoModelForSequenceClassification
m = AutoModelForSequenceClassification.from_pretrained("/path/to/fine_tuned_model")
print(m.config.id2label)

You should get something like:

{0: 'LABEL_0', 1: 'LABEL_1', 2: 'LABEL_2', 3: 'LABEL_3', 4: 'LABEL_4', 5: 'LABEL_5'}

This is what maps the label ids (0,1,2…) to actual names, so you can simply modify this dictionary to have the label names that you want. For example:

m.config.id2label = {0: 'zero', 1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five'}

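If you now perform inference with this model object (for example, by passing it to the pipeline together with its tokenizer instead of the path, or by saving it again with save_pretrained and reloading it), you’ll see the new label names (zero, one, two…) in the output. You can change the “label2id” dictionary in the same way; it should be the exact inverse of “id2label”.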
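To make the renaming stick for a pipeline that loads the model from a path, the updated config has to be written back to disk. Here is a minimal sketch of that idea. The helper names build_label_maps and relabel_saved_model are made up for illustration, and it assumes the directory already contains everything the pipeline needs (it only rewrites the model files and config):

```python
def build_label_maps(names):
    """Build matching id2label / label2id dictionaries from an ordered list of names."""
    if len(set(names)) != len(names):
        raise ValueError("label names must be unique")
    id2label = {i: name for i, name in enumerate(names)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def relabel_saved_model(model_dir, names):
    """Hypothetical helper: load a saved model, rename its labels, save it back."""
    # Imported here so build_label_maps can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    if len(names) != model.config.num_labels:
        raise ValueError("expected one name per label id")
    model.config.id2label, model.config.label2id = build_label_maps(names)
    # Overwrites the model files and config.json in the same directory.
    model.save_pretrained(model_dir)
```

After something like relabel_saved_model("/path/to/fine_tuned_model", ["zero", "one", "two", "three", "four", "five"]), reloading the pipeline from that path should report the new names instead of LABEL_0 and so on.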

And about the CSV question, I don’t think there’s a special function for that… just read the file (into a dataset object, a pandas dataframe or whatever you prefer) and iteratively provide texts to your pipeline, either one by one or in batches.
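As a rough sketch of that loop: the code below reads a CSV, feeds the texts to the classifier in batches, and computes plain accuracy. The column names "text" and "label" are assumptions about your file, the function names are made up for this example, and it assumes single-label classification where the classifier returns one {'label': ..., 'score': ...} dict per text (which is what the text-classification pipeline does by default when given a list). For a multi-label model you would need a different comparison.

```python
import csv


def read_texts_and_labels(csv_path, text_column="text", label_column="label"):
    """Read texts and gold labels from a CSV file that has a header row."""
    texts, gold = [], []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            texts.append(row[text_column])
            gold.append(row[label_column])
    return texts, gold


def predict_labels(clf, texts, batch_size=32):
    """Call the classifier on slices of the text list and collect the top labels."""
    predicted = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        predicted.extend(result["label"] for result in clf(batch))
    return predicted


def accuracy(predicted, gold):
    """Fraction of predictions that match the gold labels exactly."""
    if not gold:
        raise ValueError("no examples to evaluate")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

Usage would look like texts, gold = read_texts_and_labels("test.csv"), then accuracy(predict_labels(clf, texts), gold), where clf is your pipeline. Note that the gold labels in the CSV must use the same names the model outputs (LABEL_0 or your renamed labels), otherwise nothing will match.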
