Issue with Inference API for ViT Model - "image-feature-extraction" Error

I’m experiencing an issue with the inference API for my Vision Transformer (ViT) model, rshrott/vit-base-renovation2.

When I attempt to use the API, I receive the following error:

"error": "HfApiJson(Deserialize(Error("unknown variant image-feature-extraction, expected one of audio-classification, audio-to-audio, audio-source-separation, automatic-speech-recognition, feature-extraction, text-classification, token-classification, question-answering, translation, summarization, text-generation, text2text-generation, fill-mask, zero-shot-classification, zero-shot-image-classification, conversational, table-question-answering, image-classification, image-segmentation, image-to-text, text-to-speech, … visual-question-answering, video-classification, document-question-answering, image-to-image, depth-estimation", line: 1, column: 318)))"

Interestingly, when I use the transformers pipeline directly in Python, the model works as expected:

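from transformers import pipeline
from PIL import Image
import requests

# Build a pipeline for the model without naming a task explicitly
pipe = pipeline(model="rshrott/vit-base-renovation2")
url = "…"  # URL of a test image
# Download the image and open it with PIL
image = Image.open(requests.get(url, stream=True).raw)
preds = pipe(image)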

This code runs without any issues and returns the expected predictions. However, the same model encounters an error when used through the inference API. I suspect there might be a configuration issue related to the expected task type, but I’m not sure how to resolve it.

Could you please help me understand why this error is occurring and how I can fix it? I’ve checked the model card and configuration, but I can’t seem to find where “image-feature-extraction” is coming from or why it’s expected.
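
For reference, here is a minimal sketch of how the task tag that the Hub reports for the model could be checked against the tasks named in the error. It assumes the huggingface_hub package is installed and that its model_info function exposes the tag as pipeline_tag. The helper names (TASKS_LISTED_IN_ERROR, task_is_listed, fetch_pipeline_tag) are hypothetical, and the task set is copied from the error message above, which elides part of the list, so it is incomplete.

```python
from typing import Optional

# Task names copied from the error message. The message elides part of the
# list ("…"), so this set is incomplete and only illustrative.
TASKS_LISTED_IN_ERROR = {
    "audio-classification", "audio-to-audio", "audio-source-separation",
    "automatic-speech-recognition", "feature-extraction", "text-classification",
    "token-classification", "question-answering", "translation", "summarization",
    "text-generation", "text2text-generation", "fill-mask",
    "zero-shot-classification", "zero-shot-image-classification",
    "conversational", "table-question-answering", "image-classification",
    "image-segmentation", "image-to-text", "text-to-speech",
    "visual-question-answering", "video-classification",
    "document-question-answering", "image-to-image", "depth-estimation",
}


def task_is_listed(pipeline_tag: Optional[str]) -> bool:
    """Return True if the tag appears among the tasks named in the error."""
    return pipeline_tag in TASKS_LISTED_IN_ERROR


def fetch_pipeline_tag(repo_id: str) -> Optional[str]:
    """Ask the Hub which task tag it currently reports for a model repo."""
    # Imported here so task_is_listed works without huggingface_hub installed.
    from huggingface_hub import model_info

    return model_info(repo_id).pipeline_tag


# Example (requires network access):
#   tag = fetch_pipeline_tag("rshrott/vit-base-renovation2")
#   print(tag, task_is_listed(tag))
```

If the reported tag turns out to be image-feature-extraction, that would at least be consistent with the error, since that name is not among the tasks the message lists.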

Thank you for your assistance!

Did you find a solution? I am facing the same problem.