Persistent models

How can I get my model to stay loaded in memory? I think the model is reloaded from disk every time I run this pipeline:

from transformers import pipeline
import time

start = time.time()
#generator = pipeline('text-generation', model='bigscience/bloom-560m')
#generator = pipeline('text-generation', model='gpt2')
generator = pipeline('text-generation', model='EleutherAI/gpt-neo-1.3B')
#generator = pipeline('text-generation', model='EleutherAI/gpt-j-6B')
text = generator("Albert Einstein was:", max_length=100, num_return_sequences=1)
print(text)
end = time.time()
print("Time consumed in working:", end - start)

Or alternatively, can I run a model as a service?

Never mind, I solved it.

I want to share the code that solves it: keep the process alive and reuse the already-loaded generator for later queries. The second query takes about 50% less time than the first, since it skips the model load.

from transformers import pipeline
import time

# First query: includes the one-time cost of loading the model.
start = time.time()
#generator = pipeline('text-generation', model='bigscience/bloom-560m')
#generator = pipeline('text-generation', model='gpt2')
generator = pipeline('text-generation', model='EleutherAI/gpt-neo-1.3B')
#generator = pipeline('text-generation', model='EleutherAI/gpt-j-6B')
text = generator("Albert Einstein was:", max_length=100, num_return_sequences=1)
print(text)
end = time.time()
print("Time consumed in working:", end - start)

# Second query: reuses the generator that is already in memory,
# so only inference time is measured.
start = time.time()
text = generator("Albert Einstein was:", max_length=100, num_return_sequences=1)
print(text)
end = time.time()
print("Time consumed in working:", end - start)
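To answer the "run it as a service" part of the question too: the same load-once pattern can be wrapped in a small HTTP server so any client can query the resident model without triggering a reload. Below is a minimal stdlib-only sketch; `load_model` here is a stand-in for the real `pipeline(...)` call (so the sketch runs without downloading weights), and the route and JSON shape are my own choices, not anything prescribed by transformers.

```python
# Sketch: keep a model resident in one process and serve it over HTTP.
# load_model() is a placeholder for the expensive transformers load, e.g.:
#   generator = pipeline('text-generation', model='EleutherAI/gpt-neo-1.3B')
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def load_model():
    # Stand-in "generator" mimicking the pipeline's return shape.
    return lambda prompt: [{"generated_text": prompt + " ..."}]

generator = load_model()  # loaded ONCE, lives for the life of the process

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        prompt = json.loads(self.rfile.read(length))["prompt"]
        body = json.dumps(generator(prompt)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging for the demo
        pass

# Port 0 lets the OS pick a free port.
server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: every request hits the already-loaded generator.
port = server.server_address[1]
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/",
    data=json.dumps({"prompt": "Albert Einstein was:"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
print(result)
server.shutdown()
```

In production you would more likely reach for an existing serving layer (FastAPI, or Hugging Face's text-generation-inference), but the principle is the same: the process that owns the model never exits between queries.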