Speed up translation model

Hi there!

I’m using a translation model (say, a WMT model), but it translates only about 20 sentences per second on CUDA, and that’s too slow for me. I need to translate 1 million sentences per day, or at least get as close to that as possible. Can I somehow speed up my code without upgrading the server?

Also, I’d like to use my own model, so I’d like to know whether these methods will work for it too. But if it’s easier to speed up existing models, that’s OK.

I’m thinking about running the translation with threading or multiprocessing, but I don’t understand them well, so any tips would be very useful.
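
To make the question concrete, this is roughly the kind of thing I have in mind. It is only a minimal sketch: translate_batch is a hypothetical stand-in for the real model call (it just upper-cases the text), and the batch size and worker count are arbitrary guesses, not tuned values.

```python
from concurrent.futures import ThreadPoolExecutor


def translate_batch(sentences):
    # Hypothetical stand-in for the real model call; it only upper-cases text.
    return [s.upper() for s in sentences]


def chunked(items, size):
    # Split a list into consecutive batches of at most `size` items.
    return [items[i:i + size] for i in range(0, len(items), size)]


def translate_all(sentences, batch_size=32, workers=4):
    batches = chunked(sentences, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so output lines up with input.
        results = list(pool.map(translate_batch, batches))
    # Flatten the list of translated batches back into one list.
    return [t for batch in results for t in batch]


if __name__ == "__main__":
    print(translate_all(["hello", "world", "again"], batch_size=2, workers=2))
```

I don’t know whether threads like this would actually help with a single GPU, or whether just sending bigger batches to the model would matter more, which is part of what I’m asking.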

Also, is translator_model(text) thread-safe?
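
If it turns out not to be, I assume I could fall back to guarding the call with a lock, along the lines of the sketch below. Here translator_model is a hypothetical placeholder (it just reverses the string), not the real model, and safe_translate is a name I made up for illustration.

```python
import threading

# One lock shared by every thread that wants to call the model.
_model_lock = threading.Lock()


def translator_model(text):
    # Hypothetical placeholder for the real model call; it reverses the text.
    return text[::-1]


def safe_translate(text):
    # Only one thread at a time may be inside the model call.
    with _model_lock:
        return translator_model(text)
```

As far as I understand, this would make the calls safe but would also serialize them, so on its own it would not make translation any faster.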