Data-Parallel Multi-GPU Inference

Is there a reason you want data-parallel inference specifically, rather than using device_map/big model inference? Knowing that would help narrow down my recommendation.
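
For reference, this is the kind of setup I mean by device_map/big model inference — a minimal sketch, with the checkpoint name just a placeholder for whatever model you're actually loading:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-7b1"  # placeholder checkpoint, swap in your own

tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" lets Accelerate shard the weights across all visible GPUs
# (and CPU/disk if needed), so one copy of a large model can serve a forward pass
# even when it doesn't fit on a single card.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Inputs go to the device holding the first layers; generation runs across the shards.
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The distinction matters for the recommendation: device_map splits a single copy of the model across the GPUs (useful when the model doesn't fit on one card), whereas data parallelism puts a full copy on each GPU and splits the batch of requests between them (useful for throughput when the model does fit).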