Load pre-trained models inside a containerized pipeline for multi-lingual translation

Use case specification: I am building a containerized Docker image that uses pretrained models from Helsinki-NLP and will run on a firewalled server (so it cannot download models directly). I plan to use this image for multiple language models.

Based on the “offline” mode documentation, I am deciding between the following approaches. Should I download and save all the language models I will be using when building the Docker image (which could make the image unnecessarily large, at roughly 300MB per model), or is there a way to mirror the language model repos and direct transformers to load models/tokenizers from an on-prem hosted location when called? :hugs:
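
To make the second option concrete, here is a minimal sketch of what I have in mind, assuming the models were first saved with `save_pretrained` on a machine with internet access and the resulting folders are mounted into the container (or synced from an on-prem share) instead of being baked into the image. The `MODEL_ROOT` variable, the `/models` path, the `opus-mt-<src>-<tgt>` folder layout, and both helper functions are my own assumptions for illustration, not something from the documentation.

```python
import os
from pathlib import Path

# Hypothetical: folder where the saved models are mounted or mirrored on-prem.
MODEL_ROOT = os.environ.get("MODEL_ROOT", "/models")


def local_model_dir(src, tgt, root=MODEL_ROOT):
    """Return the local folder for a language pair, e.g. <root>/opus-mt-en-de."""
    path = Path(root) / f"opus-mt-{src}-{tgt}"
    if not path.is_dir():
        raise FileNotFoundError(f"No local model found at {path}")
    return path


def load_translator(src, tgt, root=MODEL_ROOT):
    """Load tokenizer and model from the local folder only, never from the Hub."""
    # Imported lazily so the path lookup works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    model_dir = str(local_model_dir(src, tgt, root))
    # local_files_only=True prevents any attempt to reach the Hub.
    tokenizer = MarianTokenizer.from_pretrained(model_dir, local_files_only=True)
    model = MarianMTModel.from_pretrained(model_dir, local_files_only=True)
    return tokenizer, model
```

With this layout, `load_translator("en", "de")` would read from `/models/opus-mt-en-de`, so the image itself stays small and only the language pairs actually requested get loaded. Setting `TRANSFORMERS_OFFLINE=1` in the container should additionally stop the library from trying to contact the Hub.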