Hi all,
I am stuck on how to use the pipeline with the re-ranker model “cross-encoder/ms-marco-MiniLM-L-6-v2”. I know it falls under the “text-classification” task, but I can’t figure out how to pass the two list inputs, query and paragraph, for inference within the pipeline object.
Hi @Matthieu ,
Thanks for reporting! Right, the Optimum documentation is missing a reference to the Transformers documentation for pipeline usage. The relevant documentation is: Pipelines
Here is some sample code:
from transformers import pipeline as tf_pipeline
# use function_to_apply="none" to get the raw logits as output
pipe = tf_pipeline(task="text-classification", model="cross-encoder/ms-marco-MiniLM-L-6-v2", function_to_apply="none")
##
res = pipe({"text": "How many people live in Berlin?", "text_pair": "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."})
print(res)
res = pipe({"text": "How many people live in Berlin?", "text_pair": "New York City is famous for the Metropolitan Museum of Art."})
print(res)
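# Note: each result is a dict (or a list with one dict, depending on the
# transformers version) with "label" and "score" keys; with
# function_to_apply="none" the "score" is the raw relevance logit, so a
# higher value means a more relevant passage.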
##
from optimum.pipelines import pipeline as ort_pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
# load the tokenizer explicitly and export the model to ONNX on the fly
tokenizer = AutoTokenizer.from_pretrained("cross-encoder/ms-marco-MiniLM-L-6-v2")
ort_model = ORTModelForSequenceClassification.from_pretrained("cross-encoder/ms-marco-MiniLM-L-6-v2", from_transformers=True)
ort_pipe = ort_pipeline(task="text-classification", model=ort_model, tokenizer=tokenizer, function_to_apply="none")
##
res = ort_pipe({"text": "How many people live in Berlin?", "text_pair": "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."})
print(res)
res = ort_pipe({"text": "How many people live in Berlin?", "text_pair": "New York City is famous for the Metropolitan Museum of Art."})
print(res)
Something I think is missing for tasks like MS MARCO passage ranking, which cross-encoder/ms-marco-MiniLM-L-6-v2 is meant for, is that the tokenizer expects text and text_pair to be lists of the same length. Therefore it is, AFAIK, not possible to pass a single query together with many passages directly; you can use a workaround like the example on cross-encoder/ms-marco-MiniLM-L-6-v2 · Hugging Face, but it is very inefficient.
If you are only doing batch size = 1 inference (one query, one passage), you are fine though!
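For one query against several passages, here is a minimal sketch of that kind of workaround: repeat the query once per passage so that text and text_pair line up, then sort by the returned logits. The rank_passages helper and the example passages are just illustrative, assuming the pipeline accepts a list of {"text", "text_pair"} dicts as a batch:

from transformers import pipeline

pipe = pipeline(task="text-classification", model="cross-encoder/ms-marco-MiniLM-L-6-v2", function_to_apply="none")

def rank_passages(query, passages):
    # repeat the query so that every passage gets paired with it
    inputs = [{"text": query, "text_pair": passage} for passage in passages]
    results = pipe(inputs)
    # with function_to_apply="none", "score" holds the raw relevance logit
    scores = [res["score"] for res in results]
    # sort passages from most to least relevant
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)

query = "How many people live in Berlin?"
passages = [
    "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
    "New York City is famous for the Metropolitan Museum of Art.",
]
for passage, score in rank_passages(query, passages):
    print(f"{score:.2f}  {passage}")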