Hello @plexus3d,
You can deploy your custom inference pipeline directly as an Inference Endpoint using a custom container. This means creating a model repository with your Space code and then using a custom Docker image that runs Gradio. Here is an example: philschmid/space-naver-donut-cord · Hugging Face.
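Once the endpoint is running, you call it over plain HTTP. A minimal sketch, assuming an endpoint URL and a JSON payload shape that your container defines (both are placeholders here):

```python
import os

import requests

# Placeholders: replace with your endpoint's URL and a token that can
# access it. The payload schema depends on your container's API.
ENDPOINT_URL = "https://your-endpoint.endpoints.huggingface.cloud"
HF_TOKEN = os.environ["HF_TOKEN"]

response = requests.post(
    ENDPOINT_URL,
    headers={
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "application/json",
    },
    json={"inputs": "your input here"},
)
response.raise_for_status()
print(response.json())
```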
Additionally, as you mentioned, you can create a custom inference handler. The documentation includes several examples of how to do this (a minimal handler sketch follows the list):
- Optimum and ONNX Runtime
- Diffusers with stable-diffusion
- Image Embeddings with BLIP
- TrOCR for OCR Detection
- Optimized Sentence Transformers with Optimum
- Pyannote Speaker diarization
- LayoutLM
- Flair NER
- GPT-J 6B Single GPU
- Donut Document understanding
- SetFit classifier
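For the custom handler route, the repository needs a handler.py at its root that defines an EndpointHandler class. A minimal sketch, assuming a transformers pipeline as the model; the task and pre/post-processing are placeholders you would replace with your own pipeline:

```python
from typing import Any, Dict, List

from transformers import pipeline


class EndpointHandler:
    def __init__(self, path: str = ""):
        # `path` points to the repository files; load your model once here.
        # The task below is a placeholder for your actual pipeline.
        self.pipeline = pipeline("text-classification", model=path)

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Inference Endpoints pass the request body as a dict with an
        # "inputs" key and optional "parameters".
        inputs = data["inputs"]
        parameters = data.get("parameters", {})
        return self.pipeline(inputs, **parameters)
```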
Or you can simply call the API that your Space already exposes.
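Every Gradio Space exposes an API that you can call with the gradio_client package. A short sketch; the Space name and endpoint signature are placeholders, so check the "Use via API" link on your Space for the exact names and argument order:

```python
from gradio_client import Client

# Placeholder Space name; replace with your own user/space.
client = Client("your-username/your-space")
result = client.predict("your input here", api_name="/predict")
print(result)
```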