How to configure GPU server-side batching with SageMaker HF Hosting?

I want to process multiple GPU inferences in a single batched call, instead of paying a CPU-to-GPU round trip for every request. Open-source MMS supports server-side batching; how can we use it with SageMaker HF hosting?

I'm not sure how the MMS side of things works; you could ask at Issues · awslabs/multi-model-server · GitHub.
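For reference, open-source MMS exposes server-side batching through its management API: `batch_size` and `max_batch_delay` are set when a model is registered. A sketch, assuming the management port (8081 by default) is reachable and that your container lets you call it (the SageMaker HF container may not expose this in single-model mode); `my-model.mar` is a placeholder archive name:

```shell
# Register a model archive with server-side batching enabled.
# MMS buffers up to 8 requests, or waits at most 100 ms,
# then hands them to the handler as one batch.
curl -X POST "http://localhost:8081/models?url=my-model.mar&batch_size=8&max_batch_delay=100"
```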
But once MMS is configured for batching, it depends on the task you are using: you may need to add custom logic in an inference.py that overrides input_fn or output_fn.
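A minimal sketch of what such an inference.py could look like. This only illustrates the toolkit's hook signatures; it assumes the client sends a JSON body with an "inputs" list and that model_fn returns something callable on a list (e.g. a HF pipeline), so a whole batch goes through the GPU in one forward pass. How a server-side MMS batch is actually surfaced to these hooks depends on the toolkit version, so check the inference toolkit docs for your container:

```python
import json


def input_fn(request_body, content_type="application/json"):
    # Parse the request into a list of inputs. We assume the client
    # sends {"inputs": [...]}; a lone string is wrapped into a batch.
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    inputs = json.loads(request_body)["inputs"]
    return inputs if isinstance(inputs, list) else [inputs]


def predict_fn(inputs, model):
    # `model` is whatever model_fn returned; a HF pipeline accepts a
    # list and runs it as one batched forward pass on the GPU.
    return model(inputs)


def output_fn(predictions, accept="application/json"):
    # Serialize the batch of predictions back to the client.
    return json.dumps({"predictions": predictions})
```

The key point is that input_fn returns the whole batch as one object, so predict_fn runs a single model call rather than one per example.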