I’ve deployed to google cloud Vertex AI using the one click deploy feature and it seems to be running fine. Unfortunately, it seems that my requests are being routed to the /embed endpoint as described here: Text Embeddings Inference API where I need it to be routed to the /similarity endpoint. Currently with a request like this:
{
“instances”: [
{
“inputs”: {
“sentences”: [
“What is Machine Learning?”
],
“source_sentence”: “What is Deep Learning?”
}
}
]
}
I’ll get a result like this:
{
“error”: “data did not match any variant of untagged enum Input”,
“error_type”: “Validation”
}
But if I change the JSON to:
{
“instances”: [
{
“inputs”: [“Test”]
}
]
}
I’ll get the expected EmbedResponse. How can I change the task performed by the call to the endpoint?
2 Likes
Hello Backez,
Thank you for your question.
Unfortunately with Vertex AI you cannot query the /similarity route as of now.
If you want more flexibility to query all the tei routes, you can deploy your endpoint to GKE and specify the route via cURL.
1 Like