Hello,
I’m loading two models onto two GPUs. Everything is working fine: a generator model on "cuda:0" and a classifier model on "cuda:1". The issue is that for each prompt, I get two generations and two classifications instead of only one. If anyone has experienced this before and has hints on how to solve it, please let me know.
Hi there!
It sounds like your models are running twice per prompt instead of just once. Super frustrating, but let’s figure it out! Here are a few things to check:
- Double-check your code flow – Maybe the models are accidentally getting called twice? Try adding some print statements to see what’s happening.
- Look at your DataLoader (if you’re using one) – Sometimes, it can create extra copies of the data, leading to duplicate runs.
- Check multiprocessing settings – If your script is running in multiple processes, it might be causing this issue.
- Make sure the models are on the right GPUs – Just confirm that inputs and outputs are going to cuda:0 for generation and cuda:1 for classification.
- Look for extra loops – If your function is inside another loop or batch process, that could be making it run twice.
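To make the print-statement and multiprocessing checks concrete, here is a minimal sketch of a call logger. The `generate`, `classify`, and `run` functions are hypothetical stand-ins, since I haven't seen your code; you would put the decorator on your own functions instead. It prints the process ID, a running call count, and where the call came from, plus the `RANK` and `WORLD_SIZE` environment variables that launchers such as `torchrun` set.

```python
import functools
import os
import traceback


def log_calls(fn):
    """Wrap fn so each call prints the process ID, a call count, and the caller."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        # With limit=2 the stack holds the caller's frame, then this wrapper's frame.
        caller = traceback.extract_stack(limit=2)[0]
        print(
            f"[pid {os.getpid()}] {fn.__name__} call #{wrapper.calls} "
            f"from {caller.name}:{caller.lineno}"
        )
        return fn(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


# Hypothetical stand-ins for the real generator and classifier calls.
@log_calls
def generate(prompt):
    return f"generated text for: {prompt}"


@log_calls
def classify(text):
    return "label"


def run(prompts):
    # One generation and one classification per prompt is the expected pattern.
    return [classify(generate(p)) for p in prompts]


if __name__ == "__main__":
    # Both variables are unset (None) when the script is started with plain `python`.
    print("RANK =", os.environ.get("RANK"), "WORLD_SIZE =", os.environ.get("WORLD_SIZE"))
    run(["first prompt", "second prompt"])
```

If the same pid shows the count going up twice for one prompt, the function really is being called twice in your code, and the caller name and line number tell you from where. If you see two different pids each printing call #1, the whole script is probably running once per GPU, which is what happens when it is launched with a multi-process launcher.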
If you can share a bit of your code, we can dig into it together! 