Guide/Tutorial to write an inference endpoint for custom models

Are there any docs/guides/tutorials for writing custom endpoints to be hosted on Hugging Face Hub's Inference Endpoints?

We're particularly looking at these models, which do things a little differently from normal Hugging Face AutoModels (they'll need an A100):

And potentially also these, which run on a T4:
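For context, this is roughly what I understand the custom handler contract to be: Inference Endpoints looks for a `handler.py` in the repo defining an `EndpointHandler` class with `__init__(self, path)` and `__call__(self, data)`. The model-loading and inference bodies below are placeholder assumptions (an echo, not a real model), just to show the shape I'm trying to fill in for these non-standard models:

```python
# handler.py — skeleton of a custom Inference Endpoints handler.
# The class name and method signatures follow the custom handler
# convention; everything inside the methods is a placeholder.
from typing import Any, Dict, List


class EndpointHandler:
    def __init__(self, path: str = ""):
        # `path` points at the model repository on disk; a real handler
        # would load weights here (e.g. with AutoModel or a custom loader).
        self.path = path

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # The request JSON arrives as `data`, with the payload under "inputs".
        inputs = data.get("inputs", "")
        # Placeholder "inference": echo the input back.
        return [{"output": inputs}]
```

The open question for us is what goes in `__init__`/`__call__` when the model doesn't follow the usual AutoModel loading path.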