A question about setting up an internal AI server

Hello,
I want to set up an internal AI server using ComfyUI. Ten people will connect to this server to generate images and videos. I have some questions:

1- Can ComfyUI execute requests concurrently? Or does it create a queue?

2- What hardware specifications should my server have? How much VRAM does the graphics card need?

Thank you.


When performing parallel processing with the ComfyUI API, it seems preferable to have a matching number of GPUs, i.e., one ComfyUI process per GPU.


Hello,
Thank you so much for your reply.
You said:

# Ports 8188–8191, one process per GPU (pinned with CUDA_VISIBLE_DEVICES)
CUDA_VISIBLE_DEVICES=0 python main.py --listen 0.0.0.0 --port 8188   # GPU 0
CUDA_VISIBLE_DEVICES=1 python main.py --listen 0.0.0.0 --port 8189   # GPU 1
CUDA_VISIBLE_DEVICES=2 python main.py --listen 0.0.0.0 --port 8190   # GPU 2
CUDA_VISIBLE_DEVICES=3 python main.py --listen 0.0.0.0 --port 8191   # GPU 3
# Your router picks the worker with the shortest queue and POSTs to /prompt.
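
Just to make sure I understand the router idea, it would be something like this, right? A rough sketch; I am assuming ComfyUI's standard GET /queue and POST /prompt HTTP endpoints, and that the workflow is already in API-format JSON:

import requests

# The four workers from your example, one ComfyUI process per GPU.
WORKERS = [f"http://127.0.0.1:{port}" for port in range(8188, 8192)]

def queue_length(base_url):
    """Count running + pending jobs on one worker via GET /queue."""
    info = requests.get(f"{base_url}/queue", timeout=5).json()
    return len(info.get("queue_running", [])) + len(info.get("queue_pending", []))

def submit(workflow):
    """POST an API-format workflow to the least-loaded worker."""
    target = min(WORKERS, key=queue_length)
    resp = requests.post(f"{target}/prompt", json={"prompt": workflow}, timeout=10)
    resp.raise_for_status()
    # ComfyUI returns a prompt_id that can be used to track the job.
    return target, resp.json()["prompt_id"]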

1- So ComfyUI executes tasks in a queue. If, instead of running multiple ComfyUI services as above, I run just one service, then when users simultaneously submit requests (e.g., generating an image or a video), they have to wait in the queue until the previous tasks finish. Right? (I put a quick check for this at the end of this post.)

2- Something seems strange to me. For a balanced image-first lab, you suggested 4× 24 GB GPUs (e.g., 4090-class/RTX 6000 Ada), but for a video-heavy team you suggested 2–4× 24–48 GB. Shouldn't it be the other way around? Does that mean I can do heavy video work with two 24 GB GPUs?

3- I want to use Flux and Qwen-Image models for image generation and Wan 2.2 models for video generation. My goal is to generate videos (I2V and T2V) longer than 4 seconds. If my graphics cards have 8 GB of VRAM each, how many do I need? If they have 12 GB, how many? And how much system RAM and how many CPU cores should the server have?
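
Back to question 1: here is the quick check I mentioned, a minimal sketch assuming a single instance on the default port 8188:

import requests

q = requests.get("http://127.0.0.1:8188/queue", timeout=5).json()
print("running:", len(q.get("queue_running", [])))
print("pending:", len(q.get("queue_pending", [])))
# If several users submit at once, I expect one entry in queue_running
# and the rest waiting in queue_pending.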