Image-To-Text task on Inference Endpoint

Hello,
Is there an alternative solution for this?