Proper way to gather output from accelerate multi-gpu inference

Following Distributed Inference with 🤗 Accelerate,

import torch
from accelerate import PartialState  # Can also be Accelerator or AcceleratorState
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
distributed_state = PartialState()
pipe.to(distributed_state.device)

# Assume two processes
with distributed_state.split_between_processes(["a dog", "a cat", "a chicken"], apply_padding=True) as prompt:
    result = pipe(prompt).images

I can see how the different processes produce different outputs. However, the results are distributed across multiple processes; how do I properly gather them?
I think people use `Accelerator.gather()`, but that function seems to accept only tensors as input. In my case, generating text with an LLM across multiple processes, I end up with a list of strings on each process. Can these be gathered in some way?

I also encountered this issue… Could anyone give a solution?