How to perform parallel inference using multiple GPUs

Hi, is there a way to create an instance of an LLM and load that model onto two different GPUs? Note that the instances will be created in two different Celery tasks (asynchronous tasks/jobs).
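
To make the setup concrete, here is a rough sketch of what I have in mind, assuming a Transformers causal LM and a Redis broker for Celery (the model ID, broker URL, and module/task names are just placeholders, not my actual code):

```python
# Hypothetical sketch only: model ID, broker URL, and module name are placeholders.
import torch
from celery import Celery
from transformers import AutoModelForCausalLM, AutoTokenizer

app = Celery("tasks", broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")

MODEL_ID = "gpt2"  # placeholder; the real model is larger

# One cached copy of the model per device, so repeated task calls
# in the same worker process reuse the already-loaded weights.
_models = {}

def _get_model(device):
    if device not in _models:
        tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
        model = AutoModelForCausalLM.from_pretrained(MODEL_ID).to(device)
        model.eval()
        _models[device] = (tokenizer, model)
    return _models[device]

@app.task
def generate(prompt: str, device: str) -> str:
    # Each task call pins its model copy to one GPU, e.g. "cuda:0" or "cuda:1".
    tokenizer, model = _get_model(device)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```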

Distributed inference with multiple GPUs (huggingface.co)

I went through the documentation, but I still don't know how exactly I will be able to handle both responses from the "result".
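
For example, is collecting the two responses supposed to look something like this? This is just my guess based on Celery's `delay()` / `AsyncResult.get()` API, using the hypothetical `generate` task from the sketch above:

```python
# Dispatch one task per GPU and keep the two AsyncResult handles.
r0 = generate.delay("prompt for the first request", "cuda:0")
r1 = generate.delay("prompt for the second request", "cuda:1")

# .get() blocks until each task finishes and returns its decoded text.
responses = [r0.get(timeout=120), r1.get(timeout=120)]
print(responses)
```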