Returning the score associated with prediction_value from loaded_model

Hi. I’m trying to get the percentage score (confidence) corresponding to prediction_value using the following Python snippet:

import tensorflow as tf

# Tokenize the email and run it through the fine-tuned model
predict_input = loaded_tokenizer.encode(email_text, truncation=True, padding=True, return_tensors="tf")
output = loaded_model(predict_input)[0]  # logits, shape (1, num_labels)

# Index of the highest-scoring class
prediction_value = tf.argmax(output, axis=1).numpy()[0]
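In case it helps to see the idea in isolation: the logits in `output` can be converted to per-class probabilities with a softmax, and the score for `prediction_value` is just the probability at that index. A minimal sketch in plain NumPy (the logit values here are made up for illustration; they stand in for what `loaded_model(predict_input)[0]` would return):

```python
import numpy as np

# Hypothetical logits for one email from a two-class classifier;
# in the original code these come from loaded_model(predict_input)[0].
logits = np.array([[1.2, 3.4]])

# Numerically stable softmax: probabilities over classes sum to 1
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

prediction_value = int(probs.argmax(axis=1)[0])
score = float(probs[0, prediction_value])  # confidence for the predicted class
```

With TensorFlow tensors, `tf.nn.softmax(output, axis=1)` computes the same probabilities directly, so the score would be `tf.nn.softmax(output, axis=1).numpy()[0][prediction_value]`.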

loaded_model and loaded_tokenizer were fine-tuned earlier from DistilBERT.

I’ve tried using TextClassificationPipeline, but unfortunately “email_text” has too many tokens (more than 512), so the pipeline doesn’t work. If I truncate email_text, it returns an incorrect prediction_value.

Thanks in advance


I searched to see whether this was a bug or a configuration issue, but it seems to be a deep-rooted limitation.
By the way, you can pass model and tokenizer options to the pipeline in addition to the pipeline’s own options. Check the model description to see which options are available; that may be an easy way to solve this.

Thank you. It hadn’t occurred to me that the input could be chunked. Although it will make classification slower, if it makes it more accurate, that will be really helpful. I’ll look into it.
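For anyone landing here later, the chunking idea can be sketched roughly like this: split the token ids into windows of at most 512 tokens, score each window, and average the probabilities. This is a minimal sketch with a hypothetical `run_model` callable standing in for the real model call, not the pipeline’s actual API:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis
    e = np.exp(logits - np.max(logits, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def classify_long_text(token_ids, run_model, max_len=512):
    """Classify a long token sequence by averaging per-window probabilities.

    `run_model` is a hypothetical callable mapping a window of token ids
    to a vector of class logits.
    """
    # Split into non-overlapping windows of at most max_len tokens
    windows = [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]
    probs = np.stack([softmax(np.asarray(run_model(w), dtype=float)) for w in windows])
    mean_probs = probs.mean(axis=0)
    return int(mean_probs.argmax()), float(mean_probs.max())
```

An overlapping (strided) split is a common variant, and averaging could be replaced by max-pooling the per-window scores; which works better depends on the data.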
