Allow Multiple Processes at Once

Hello! New to HF here, and fairly new to ML.

I put together (through a fair amount of trial and error) a handler.py for Microsoft’s SpeechT5 TTS model, in my own copy of the repository.

However, it seems to process only one request at a time, while any other request waits. On the Inference Endpoint, it looks like it uses only a single core and 2-3 GB of the 16 GB available. Is there a way to let it use multiple cores / instances?
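In case it helps anyone reproduce this, here is a minimal sketch of how the one-at-a-time behavior can be measured. It does not call the real endpoint or my handler: `serialized_request`, `concurrent_request`, and the `DELAY` value are hypothetical stand-ins that only simulate the two cases. The idea is to fire several requests at once and compare the total wall-clock time.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

DELAY = 0.05  # hypothetical per-request inference time, in seconds
_lock = threading.Lock()


def serialized_request(_):
    # Stand-in for an endpoint that handles one request at a time:
    # every caller has to wait for the lock before "inference" runs.
    with _lock:
        time.sleep(DELAY)


def concurrent_request(_):
    # Stand-in for an endpoint that can overlap requests.
    time.sleep(DELAY)


def time_batch(request_fn, n_requests=4):
    """Send n_requests at once and return the total wall-clock time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        list(pool.map(request_fn, range(n_requests)))
    return time.perf_counter() - start


serial_time = time_batch(serialized_request)
overlap_time = time_batch(concurrent_request)
print(f"one at a time: {serial_time:.2f}s, overlapping: {overlap_time:.2f}s")
```

If the total time grows roughly linearly with the number of simultaneous requests, as in the first case, the requests are being queued rather than processed in parallel, which matches what I am seeing.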

https://huggingface.co/Dupaja/speecht5_tts/blob/main/handler.py is the file, for reference.

Thanks!