You have a few options. Here are some, in increasing order of difficulty:
- You can use the Hugging Face Inference API via the Model Hub if you just want a demo.
- You can use a hosted model deployment platform: GCP AI Platform Prediction, SageMaker, or https://modelzoo.dev/. Full disclosure: I am the developer behind Model Zoo, and I'm happy to give you some credits for experimentation.
- You can roll your own model server with something like https://fastapi.tiangolo.com/ and deploy it on a generic serving platform like AWS Elastic Beanstalk or Heroku. This is the most flexible option.
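
To give a feel for the last option, here is a minimal sketch of a FastAPI model server. The `predict` function is a hypothetical stand-in (a toy word counter, not a real model), and the `/predict` route, the `PredictRequest` schema, and the `create_app` factory are names I made up for illustration. Swap in your own model's inference call.

```python
from typing import Dict

# Hypothetical stand-in for a real model: scores text by counting a few
# hard-coded words. Replace this with your own model's inference call.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def predict(text: str) -> Dict[str, object]:
    """Return a label and a raw score for one input string."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"label": label, "score": score}


def create_app():
    """Build the FastAPI app. Imports are local so predict() has no dependencies."""
    from fastapi import FastAPI
    from pydantic import BaseModel

    class PredictRequest(BaseModel):
        # Assumed request schema: a JSON body like {"text": "..."}.
        text: str

    app = FastAPI()

    @app.post("/predict")
    def predict_endpoint(request: PredictRequest):
        # FastAPI validates the JSON body and serializes the returned dict.
        return predict(request.text)

    return app
```

Assuming you save this as `main.py` and have `fastapi` and `uvicorn` installed, something like `uvicorn main:create_app --factory` should serve it locally. Keeping the inference logic in a plain function makes it easy to test without starting a server, and the same app can then be packaged for whichever platform you deploy to.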