Image feature extraction with a hosted model

Is there any hosted model on Hugging Face that performs image feature extraction and returns the feature vector (aka image2vec, image embedding)?

I came across this one that is documented to do just that:
florentgbelidji/blip_image_embeddings · Hugging Face

But when I called it I got an error:
{'error': ['value is not a valid list: `__root__` in `parameters`', 'str type expected: `__root__` in `parameters`']}

The test code:

import base64

import requests as r

API_TOKEN = ""  # placeholder: your Hugging Face access token
ENDPOINT_URL = "https://api-inference.huggingface.co/models/florentgbelidji/blip_image_embeddings"


def predict(path_to_image: str):
    # Read the image and base64-encode it so it can be sent as JSON
    with open(path_to_image, "rb") as i:
        b64 = base64.b64encode(i.read())
    payload = {"inputs": {"image": b64.decode("utf-8")}}
    response = r.post(
        ENDPOINT_URL, headers={"Authorization": f"Bearer {API_TOKEN}"}, json=payload
    )
    return response.json()


img = "image.jpg"  # placeholder: path to a local image file
prediction = predict(path_to_image=img)
print(prediction)

hi @avivhu ,

I don’t think we have feature extraction from images in our community pipelines, i.e. via our Inference API. The model you’ve linked is an example of a custom pipeline for our Inference Endpoints, cc @florentgbelidji the author.
If you want it on our community API for tests and prototypes, you can open an issue here: Issues · huggingface/api-inference-community · GitHub

Thanks @radames .
I opened a feature request here:
Add image feature extraction to the Inference API [img2vec] · Issue #229 · huggingface/api-inference-community (github.com)

hi @avivhu , sorry, I've now updated the custom pipeline so that it works on our public API.
You can use it from this model repo now: radames/blip_image_embeddings · Hugging Face

import base64

import requests as r

ENDPOINT_URL = "https://api-inference.huggingface.co/models/radames/blip_image_embeddings"
HF_TOKEN = ""  # your Hugging Face access token


def predict(path_to_image: str):
    # Read the image and base64-encode it; the string goes directly under "inputs"
    with open(path_to_image, "rb") as i:
        b64 = base64.b64encode(i.read())
    payload = {"inputs": b64.decode("utf-8")}
    response = r.post(
        ENDPOINT_URL,
        # X-Wait-For-Model asks the API to wait while the model is loading
        headers={"X-Wait-For-Model": "true", "Authorization": f"Bearer {HF_TOKEN}"},
        json=payload,
    )
    return response.json()


prediction = predict(path_to_image="palace.jpg")
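
For anyone adapting the snippet above, here is a minimal sketch of two small helpers: one builds the payload in the same shape (a base64 string directly under "inputs"), the other checks the parsed JSON for an error object like the one quoted at the top of this thread before the result is used. The names build_payload and check_response are hypothetical, and the shape of a successful response is not specified here, so the sketch returns it unchanged.

```python
import base64
from typing import Any


def build_payload(image_bytes: bytes) -> dict:
    """Wrap raw image bytes as a base64 string under "inputs"."""
    return {"inputs": base64.b64encode(image_bytes).decode("utf-8")}


def check_response(body: Any) -> Any:
    """Raise if the parsed JSON body is an error object; otherwise return it as-is.

    Assumption: errors arrive as a dict with an "error" key, as in the
    error message quoted earlier in this thread.
    """
    if isinstance(body, dict) and "error" in body:
        raise RuntimeError(f"Inference API error: {body['error']}")
    return body
```

With these, the request body becomes json=build_payload(image_bytes) and the return value becomes check_response(response.json()), so a malformed request fails loudly instead of passing an error dict downstream.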

That works perfectly. Thank you @radames .
