Hi,
I trained a model with BertForSequenceClassification on a binary label set. I am trying to score some examples from my test set using the pipeline function, as follows:
classifier = pipeline("sentiment-analysis", model=model,
                      tokenizer=tokenizer, function_to_apply="softmax")
result = classifier("Dog very ill.")[0]
# {'label': 'LABEL_1', 'score': 0.9970490336418152}
or
classifier = pipeline("sentiment-analysis", model=model,
                      tokenizer=tokenizer, function_to_apply="sigmoid")
result = classifier("Dog very ill.")[0]
# {'label': 'LABEL_1', 'score': 0.9585336446762085}
I am trying to get a single sigmoid-style score (where inputs closer to label 0 score between 0 and 0.50, and inputs closer to label 1 score between 0.51 and 1). But whether I specify auto, sigmoid, or softmax, the output has the same form: only the predicted label, with a score for that label (Label: 0, Score: 0.00 -> 1.00; Label: 1, Score: 0.00 -> 1.00).
Any ideas? Many thanks in advance.
I have found the solution to my problem, so I am posting it below in case anyone else wants the same thing.
I would ignore function_to_apply and instead use return_all_scores=True. By default, the pipeline returns only the highest-scoring label and its score (e.g., {'label': 'LABEL_1', 'score': 0.9585...}). If you want a sigmoid/logistic-regression-style response, where every value sits on a continuous scale between 0 and 1, setting return_all_scores=True makes it return a nested list with one dictionary per label:
[[{'label': 'LABEL_0', 'score': 0.3905504047870636}, {'label': 'LABEL_1', 'score': 0.609449565410614}]]
where the two scores sum to 1.
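For anyone wondering why the two scores sum to 1 and why the LABEL_1 score already behaves like a sigmoid: with two labels, the softmax probability of LABEL_1 equals the sigmoid of the difference between the two logits. Applying a sigmoid to each logit separately (which is my understanding of what function_to_apply="sigmoid" does) gives scores that need not sum to 1. Below is a minimal sketch in plain Python. The logits are made-up values for illustration, not output from a real model, and the helper functions are my own rather than anything from transformers.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logits for LABEL_0 and LABEL_1 (illustrative values only).
logits = [-1.0, 2.0]

# Softmax across both labels: the two probabilities sum to 1.
probs = softmax(logits)
p_label_1 = probs[1]

# Sigmoid of the logit difference gives the same LABEL_1 probability.
sigmoid_of_diff = sigmoid(logits[1] - logits[0])

# Sigmoid applied to each logit independently: no guarantee of summing to 1.
independent = [sigmoid(z) for z in logits]

print("softmax:", probs, "sum =", sum(probs))
print("P(LABEL_1) via softmax:", p_label_1)
print("sigmoid(z1 - z0):      ", sigmoid_of_diff)
print("per-logit sigmoid:", independent, "sum =", sum(independent))
```

So the LABEL_1 entry of the softmax output is already the continuous 0-to-1 score the question asks for: below 0.5 means the model leans towards LABEL_0, above 0.5 means it leans towards LABEL_1.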
To pull out just the score you want, use:
result = classifier("Dog very ill.")
result[0][1].get('score')
Here result is the classifier output for your text. This returns only the score, without the label: indexing [0][1] selects the LABEL_1 entry, whereas [0][0] selects LABEL_0.
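Positional indexing relies on the labels coming back in a fixed order, so a slightly more defensive option is to look the score up by label name. The sketch below uses a hypothetical helper, score_for_label (my own name, not part of transformers), and runs on the sample output quoted in this post rather than calling a real model. Handling a flat list of dictionaries is an assumption about other configurations, not something shown in this thread.

```python
def score_for_label(result, label="LABEL_1"):
    """Return the score for `label` from text-classification pipeline output.

    Hypothetical helper for illustration; not part of transformers.
    """
    # Accept the nested shape shown in this post ([[{...}, {...}]]) and,
    # as an assumption, a flat list of dicts ([{...}, {...}]).
    entries = result[0] if isinstance(result[0], list) else result
    for entry in entries:
        if entry["label"] == label:
            return entry["score"]
    raise KeyError(f"label {label!r} not found in pipeline output")

# Sample output copied from the post above (return_all_scores=True).
sample = [[{'label': 'LABEL_0', 'score': 0.3905504047870636},
           {'label': 'LABEL_1', 'score': 0.609449565410614}]]

label_1_score = score_for_label(sample, "LABEL_1")
label_0_score = score_for_label(sample, "LABEL_0")
print(label_1_score, label_0_score)
```

With a real pipeline you would pass classifier("Dog very ill.") in place of sample. As far as I know, newer transformers releases deprecate return_all_scores in favor of top_k=None, so it is worth checking the documentation for your installed version.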
Anyway, hope this helps.