I trained, evaluated, and saved my model: model_bert.save_pretrained("fine_tuned_model").
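For context, the saving step in my notebook looks roughly like this (a minimal sketch; the variable names and base checkpoint are just what I used / assumptions, not anything special):

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed setup from fine-tuning: 6 labels, BERT base checkpoint (illustrative)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model_bert = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=6)

# ... fine-tuning and evaluation happen here ...

# Save the model and the tokenizer into one folder so the pipeline can reload everything from it
model_bert.save_pretrained("fine_tuned_model")
tokenizer.save_pretrained("fine_tuned_model")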
Now I want to perform inference, so I load the model using pipeline: clf = pipeline("text-classification", "/content/fine_tuned_model").
Then I pass a list of texts to the model: clf(tx, return_all_scores=True)
Then I get this result:
Disabling tokenizer parallelism, we're using DataLoader multithreading already
[[{'label': 'LABEL_0', 'score': 0.9765037894248962},
{'label': 'LABEL_1', 'score': 0.30014175176620483},
{'label': 'LABEL_2', 'score': 0.9280667901039124},
{'label': 'LABEL_3', 'score': 0.06726877391338348},
{'label': 'LABEL_4', 'score': 0.8652555346488953},
{'label': 'LABEL_5', 'score': 0.15145337581634521}],
[{'label': 'LABEL_0', 'score': 0.8798120021820068},
{'label': 'LABEL_1', 'score': 0.006957885809242725},
{'label': 'LABEL_2', 'score': 0.06591048091650009},
{'label': 'LABEL_3', 'score': 0.0038158840034157038},
{'label': 'LABEL_4', 'score': 0.06268540769815445},
{'label': 'LABEL_5', 'score': 0.04680700972676277}]]
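For reference, the whole inference step is roughly this (tx is just my own list of raw strings; the example texts below are placeholders):

from transformers import pipeline

# Load the fine-tuned checkpoint as a text-classification pipeline
clf = pipeline("text-classification", "/content/fine_tuned_model")

# Placeholder inputs; in my case tx comes from my own data
tx = ["first example sentence", "second example sentence"]

# return_all_scores=True returns one list of {label, score} dicts per input text
# (newer transformers versions also accept top_k=None for the same behaviour)
results = clf(tx, return_all_scores=True)
print(results)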
My questions are:
- Is it possible to get the actual label names instead of LABEL_0, LABEL_1, etc.?
- Also, what is the best way to evaluate the model when I have a CSV file of texts? Is there a special function for this, or do I have to do it manually?