I’m new to Hugging Face’s Inference API and wanted to check whether a model whose task is Sentence Similarity can return sentence embeddings instead.
For example, in this sentence-transformers model, the model task is to return sentence similarity. Instead, I would like to just get the embeddings of a list of sentences.
Is there an API parameter I can tweak to get this? Help would be very much appreciated!
Yes, you can compute sentence embeddings. Here is an example:
import requests

API_URL = "https://api-inference.huggingface.co/pipeline/feature-extraction/sentence-transformers/all-mpnet-base-v2"
headers = {"Authorization": "Bearer API_TOKEN"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": ["this is a sentence", "this is another sentence"]
})
# Output is a list of 2 embeddings, each of 768 values.
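If you still want a similarity score after switching to embeddings, you can compute it yourself from the returned vectors. Below is a minimal sketch, assuming the response is a plain list of equal-length float lists as described above. The `cosine_similarity` helper and the `example_output` values are made up for illustration (real embeddings from this model would have 768 values each), and cosine similarity is my assumption about the metric you would want.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical stand-in for the `output` returned by query();
# tiny 3-dimensional vectors are used here only to keep the example readable.
example_output = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

similarity = cosine_similarity(example_output[0], example_output[1])
print(similarity)
```

With a real response you would pass `output[0]` and `output[1]` instead of the stand-in vectors.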
Since the main difference here seems to be the addition of pipeline/feature-extraction in the URL, how would you replicate this approach on a deployed Inference Endpoint, i.e. “https://some-custom-key.region.aws.endpoints.huggingface.cloud”?