I’ve deployed the cross-encoder/ms-marco-TinyBERT-L-2 model in Azure ML Studio for inference, and I’m trying to re-rank chunks for a user query. If I pass the text & text-pair inputs as shown in the model card via the Hugging Face Inference API, the model works as expected, but when I pass the same inputs to the ML Studio endpoint, I don’t get the correct response.
import requests

# Replace with your actual Hugging Face API key
HF_API_KEY = ""
# AZ_API_KEY = ""

# The API URL for the specific cross-encoder model
HF_API_URL = "https://api-inference.huggingface.co/models/cross-encoder/ms-marco-TinyBERT-L-2"
# AZ_API_URL = 'https://docs-reranker-ypifx.eastus2.inference.ml.azure.com/score'

# The headers to include in the API request
# az_headers = {
#     "Authorization": f"Bearer {AZ_API_KEY}",
#     "azureml-model-deployment": "cross-encoder-ms-marco-tinyb-10",
# }
hf_headers = {
    "Authorization": f"Bearer {HF_API_KEY}"
}

def classify_text_pair(query, chunks):
    # The payload is a dictionary with the "inputs" key, holding
    # one {"text", "text_pair"} dict per (query, passage) pair
    model_inputs = [{"text": query, "text_pair": str(passage)} for passage in chunks]
    payload = {"inputs": model_inputs}
    # payload = {
    #     "inputs": [{"text": query, "text_pair": passage}]
    # }

    # Send the POST request to the Hugging Face Inference API
    response = requests.post(HF_API_URL, headers=hf_headers, json=payload)
    # response = requests.post(AZ_API_URL, headers=az_headers, json=payload)

    # Check for a successful response and return the result
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"Request failed with status code {response.status_code}: {response.text}")
# Example usage
query = "What is the capital of France?"
chunks = [
'Paris is the capital of France.',
'Spain is a country in Europe.',
'The Eiffel Tower is in Paris.',
'Many people in France speak French.',
'France is known for its wine.',
'France is a country in Europe.',
'The capital of Italy is Rome.',
'France has a population of about 67 million.',
'Paris is known for its museums and cultural sites.',
'The Seine is a river in France.'
]
# Call the function with the query and passages
result = classify_text_pair(query, chunks)
print(result)
# Result for HF endpoint - [[{'label': 'LABEL_0', 'score': 0.9326303601264954}], [{'label': 'LABEL_0', 'score': 0.0010434213327243924}], [{'label': 'LABEL_0', 'score': 0.0009471822413615882}], [{'label': 'LABEL_0', 'score': 0.003246085485443473}], [{'label': 'LABEL_0', 'score': 0.013439307920634747}], [{'label': 'LABEL_0', 'score': 0.019809093326330185}], [{'label': 'LABEL_0', 'score': 0.005551024805754423}], [{'label': 'LABEL_0',
'score': 0.006500988733023405}], [{'label': 'LABEL_0', 'score': 0.0012194315204396844}], [{'label': 'LABEL_0', 'score': 0.0014617841225117445}]]
# Result for AZ ML studio endpoint - {'label': 'LABEL_0', 'score': 0.02217939682304859}
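Since the Azure endpoint returns a single {'label', 'score'} dict, it looks as if only the first pair is being scored. One workaround I’m considering is sending one (query, passage) pair per request and collecting the scores. A minimal sketch of just the payload construction (the request/response handling would be the same as above; whether the Azure scoring script accepts this shape is an assumption on my part):

```python
def build_pairwise_payloads(query, chunks):
    # One payload per (query, passage) pair, mirroring the batched
    # {"inputs": [...]} shape but with a single pair in each request
    return [{"inputs": [{"text": query, "text_pair": str(p)}]} for p in chunks]

payloads = build_pairwise_payloads(
    "What is the capital of France?",
    ["Paris is the capital of France.", "Spain is a country in Europe."],
)
print(len(payloads))                          # 2
print(payloads[0]["inputs"][0]["text_pair"])  # Paris is the capital of France.
```

Each payload would then be POSTed separately to the /score endpoint, which is obviously slower than one batched call, so I’d still prefer to understand why the batched input doesn’t work.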
Can someone help me understand why this isn’t working for models deployed in Azure ML Studio?