Batch inference of NLLB models with different source languages
I need help running batch inference with NLLB models where the source language can vary across examples. Pipeline inference is slow even on a GPU: translating 10k examples in various languages one at a time took about 6 hours, while batch inference on 4.5k examples in a single language took about 1.6 hours.
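For context, one way I have seen to avoid the per-example pipeline overhead with plain transformers is to group the inputs by source language and call `model.generate` on each group as a padded batch. Below is a minimal, untested sketch; the checkpoint name, language codes, and example sentences are just illustrative assumptions, not my actual setup:

```python
# Minimal sketch: batched NLLB inference with transformers, grouping
# examples by source language so each group is tokenized with the
# correct src_lang. Checkpoint and language codes are illustrative.
from collections import defaultdict

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to("cuda").eval()

# Each example carries its own NLLB source language code.
examples = [
    ("Hello world.", "eng_Latn"),
    ("Bonjour le monde.", "fra_Latn"),
    ("Hallo Welt.", "deu_Latn"),
]
target_lang = "spa_Latn"

# Group by source language: the NLLB tokenizer prepends the language
# token based on tokenizer.src_lang, so each group is tokenized separately.
groups = defaultdict(list)
for text, src_lang in examples:
    groups[src_lang].append(text)

translations = []
for src_lang, texts in groups.items():
    tokenizer.src_lang = src_lang
    inputs = tokenizer(texts, return_tensors="pt", padding=True).to("cuda")
    with torch.no_grad():
        generated = model.generate(
            **inputs,
            forced_bos_token_id=tokenizer.convert_tokens_to_ids(target_lang),
            max_new_tokens=128,
        )
    translations.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))

print(translations)
```

If the target language varies too, the same grouping trick applies on the target side, since `forced_bos_token_id` is fixed per `generate` call.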
@guillaumekln Thanks for the reply. Awesome repo; I looked around and found it useful for understanding the translation steps. Could you share an example that translates 50 or 100 examples in one go, so I can time it as well? I tried adapting the example in the repo for batching but couldn't get it to work.
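Concretely, what I am after is something like the sketch below, adapted from the single-sentence NLLB example (assuming the repo in question is CTranslate2). The converted model path is a placeholder, and I may well be misusing `translate_batch`:

```python
# Rough sketch of what I am trying: translate a whole batch with
# CTranslate2, tokenizing each example with its own source language.
# The model path is a placeholder for a checkpoint converted with
# ct2-transformers-converter.
import ctranslate2
import transformers

translator = ctranslate2.Translator("nllb-200-distilled-600M-ct2", device="cuda")
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M"
)

examples = [
    ("Hello world.", "eng_Latn"),
    ("Bonjour le monde.", "fra_Latn"),
]
target_lang = "deu_Latn"

# Tokenize per example so the correct source language token is prepended;
# mixed source languages in one batch should be fine on the encoder side.
batch_tokens = []
for text, src_lang in examples:
    tokenizer.src_lang = src_lang
    batch_tokens.append(tokenizer.convert_ids_to_tokens(tokenizer.encode(text)))

# One call for the whole batch; CTranslate2 pads and sub-batches internally.
results = translator.translate_batch(
    batch_tokens,
    target_prefix=[[target_lang]] * len(batch_tokens),
    max_batch_size=64,
    beam_size=4,
)

for result in results:
    # Strip the forced target language token from the hypothesis.
    target_tokens = result.hypotheses[0][1:]
    print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target_tokens)))
```

Does this look like the intended use, and is there anything obvious I should change to get good throughput on 50-100 examples per call?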