Streamlit App Faster Model Loading

An open question for anyone who has used Transformer models in a Streamlit app. I am using:

pipeline("summarization", model="sshleifer/distilbart-cnn-6-6", tokenizer="sshleifer/distilbart-cnn-6-6", framework="pt")

to do summarization in the app. However, it takes about 55 seconds to create the summary, and 35 seconds or more of that time appears to be spent downloading the model. Is there a way to access the model more quickly? Perhaps by pre-loading the model to Streamlit Sharing (via the GitHub repo the app sits in)?

Also, the summary generation part of the app works once or twice, but if it is run any more times than that, the app crashes. Has anyone else had this experience?

No experience with Streamlit itself, but you can always download the model locally. Usage is a bit different in that case: you provide a path to a directory instead of just the model name. So download all of the model files to a directory, and then pass that directory as the model and tokenizer arguments, as sketched below.
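A minimal sketch of that approach, assuming the standard save_pretrained/from_pretrained API in transformers; the directory name local_distilbart is just an example:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

MODEL_DIR = "local_distilbart"  # example directory, e.g. checked into the app's repo

# One-time step: download the model and tokenizer, then save them locally.
AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distilbart-cnn-6-6").save_pretrained(MODEL_DIR)
AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-6-6").save_pretrained(MODEL_DIR)

# In the app: load from the local directory, so nothing is downloaded at runtime.
summarizer = pipeline("summarization", model=MODEL_DIR, tokenizer=MODEL_DIR, framework="pt")
```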

You should wrap the loading of the model/pipeline in a function and add a streamlit.cache decorator (see the sketch below the last reply). That way, the loading/downloading part will only be done once.

https://docs.streamlit.io/en/stable/api.html#optimize-performance


Within the streamlit.cache() decorator you'll get better performance if you use allow_output_mutation=True, because then Streamlit reuses the same copy of the model in memory rather than reloading it every time the script re-runs.
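Putting the two replies above together, a minimal sketch, assuming the st.cache API from the docs linked above:

```python
import streamlit as st
from transformers import pipeline

@st.cache(allow_output_mutation=True)  # load once; reuse the same object on reruns
def load_summarizer():
    return pipeline(
        "summarization",
        model="sshleifer/distilbart-cnn-6-6",
        tokenizer="sshleifer/distilbart-cnn-6-6",
        framework="pt",
    )

summarizer = load_summarizer()  # cached after the first call

text = st.text_area("Text to summarize")
if text:
    st.write(summarizer(text, max_length=130, min_length=30)[0]["summary_text"])
```

With this in place, only the first run pays the download/load cost; later summarizations reuse the in-memory pipeline, which should remove the ~35 seconds of per-request download time described in the question.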