I have a single GPU and a classification task that involves a lot of pre- and post-processing in Python, so actual GPU utilisation sits between 20% and 50%, depending on the individual input.
Can I run the Python code in two threads and rely on transformers or PyTorch to do the right thing if I call the pipeline concurrently?
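To make the question concrete, here is a minimal sketch of what I have in mind. `classify` is a stand-in for the real pipeline call (in my actual code it would be something like `pipeline("text-classification", device=0)`); the hope is that the CPU-side pre/post-processing of one input can overlap with GPU inference on another:

```python
from concurrent.futures import ThreadPoolExecutor

def classify(text):
    # Stand-in for: pre-processing in Python, then the transformers
    # pipeline call on the GPU, then post-processing in Python.
    return {"label": "POSITIVE", "score": 0.99}

inputs = ["example one", "example two", "example three", "example four"]

# Two worker threads calling the (shared) pipeline concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(classify, inputs))

print(results)
```

Is sharing one pipeline object across threads like this safe, or do I need a lock (or a separate pipeline per thread)?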