I am working on a cloth-swapping project in which I use GroundedSAM (to obtain cloth masks) and then a Stable Diffusion 3 pipeline with a ControlNet model that supports SD3.
The problem is that when I create the pipeline object (pipe) for the pre-trained SD3 model and move it to my available device ("cuda" in my case), it throws a CUDA out-of-memory error.
If I instead omit the CUDA mapping and run the subsequent steps, I get a different error saying that the tensors are on different devices and must be on the same device. I have tried the model on Google Colab (Pay As You Go, 16 GB VRAM) and on RunPod (24 GB VRAM); in both cases the same error occurs. I am attaching a small reproducible snippet.
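For reference, this is roughly what the setup looks like (simplified; the checkpoint IDs shown here are illustrative of what I load, and the model download is wrapped in a function):

```python
def build_sd3_pipeline(device: str = "cuda"):
    """Build the SD3 + ControlNet pipeline; the .to(device) call is where the OOM is raised."""
    import torch
    from diffusers import SD3ControlNetModel, StableDiffusion3ControlNetPipeline

    controlnet = SD3ControlNetModel.from_pretrained(
        "InstantX/SD3-Controlnet-Canny"  # illustrative SD3-compatible ControlNet
    )
    pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
        "stabilityai/stable-diffusion-3-medium-diffusers",
        controlnet=controlnet,
    )
    # Moving the full-precision pipeline (transformer + three text encoders,
    # including T5-XXL) onto the GPU is the step that fails for me.
    return pipe.to(device)
```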
Note: the Roboflow GroundedSAM model is also running in the same environment.
I am also attaching the error output (it originates right after the SD3 pipeline is created):
OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 23.64 GiB of which 2.81 MiB is free. Process 2832761 has 23.63 GiB memory in use. Of the allocated memory 22.72 GiB is allocated by PyTorch, and 456.70 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Output is truncated.
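Reading the numbers in that message, almost all of the VRAM is genuinely held by tensors rather than sitting unused in the allocator cache, which makes me think the total footprint (SD3 plus GroundedSAM) is the problem rather than fragmentation alone:

```python
# Figures taken directly from the error message above.
gib = 1024**3
mib = 1024**2

total     = 23.64 * gib    # reported GPU capacity
allocated = 22.72 * gib    # actively allocated by PyTorch tensors
cached    = 456.70 * mib   # reserved by PyTorch but currently unallocated
free      = 2.81 * mib     # free on the device

print(f"allocated by tensors: {allocated / total:.1%}")  # ~96% of the GPU
print(f"allocator cache:      {cached / total:.1%}")     # ~1.9% of the GPU
```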
Since Roboflow GroundedSAM only runs in Google Colab for me, what would you suggest in this situation? Should I move to a GPU with more VRAM (e.g. 32 GB on RunPod), or is there some other issue?
I have already tried clearing the CUDA cache and adjusting the CUDA allocator settings, but neither helped.
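In case it matters, this is roughly the form those attempts took (a sketch; the torch import is guarded so the snippet also runs where torch is absent):

```python
import gc
import os

# The allocator config is read when CUDA is first initialized,
# so this must be set before any CUDA allocation happens.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

def release_gpu_memory() -> None:
    """Run after deleting large objects (e.g. `del pipe`) to return cached CUDA blocks."""
    gc.collect()
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass  # torch not installed in this environment
```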