How to pass large context to pipeline once instead of again and again for each query?

NDugar · February 6, 2025, 7:32am

I am trying to build a database query generator for my schema. To get accurate results, I pass some example user/assistant interactions so that the model follows the expected query format.

import os
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

schema = "..."  # database schema (omitted here)
messages = [
    {"role": "user", "content": "Find John"},
    {"role": "assistant", "content": "{\"response.someset.subset.match_name\": \"John\"}"},
    # ... and about 10 more examples like this
]
current_directory = os.path.dirname("/workspace/Qwen2.5-Coder-1.5B-Instruct/")
model = AutoModelForCausalLM.from_pretrained(
    current_directory,  # previously "microsoft/Phi-3.5-mini-4k-instruct"
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
# tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-4k-instruct")
tokenizer = AutoTokenizer.from_pretrained(current_directory)
print("starting")
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)
query = "..."  # the query I want to run
messages.append({"role": "user", "content": query})
generation_args = {}  # generation settings (not shown here)
output = pipe(messages, **generation_args)

The difficulty is that passing a long schema and this many messages greatly increases my inference time. Is there a way to pass them once and have the model remember them, instead of passing them again with every query?
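
One approach that may fit here is prefix caching: run the shared schema and example turns through the model once, keep the resulting key/value cache, and reuse a copy of it for every new query. The following is a minimal sketch, not a definitive implementation. It assumes a recent transformers version in which `DynamicCache` can be passed as `past_key_values`, it calls `model.generate` directly instead of going through `pipeline`, and the function names (`shares_prefix`, `build_prefix_cache`, `generate_with_cache`) are hypothetical helpers made up for illustration. It also assumes the chat template renders the shared messages identically whether or not a new user turn follows, which the code checks at the token level before reusing the cache.

```python
import copy


def shares_prefix(prefix_ids, full_ids):
    """Return True if full_ids starts with prefix_ids (needed for the cache to be valid)."""
    n = len(prefix_ids)
    return len(full_ids) >= n and list(full_ids[:n]) == list(prefix_ids)


def build_prefix_cache(model, tokenizer, prefix_messages):
    """Run the shared messages through the model once; return (prefix_ids, cache)."""
    # Imported lazily so that shares_prefix stays usable without torch installed.
    import torch
    from transformers import DynamicCache

    text = tokenizer.apply_chat_template(prefix_messages, tokenize=False)
    # The chat template already inserts special tokens, so do not add them again.
    inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
    with torch.no_grad():
        cache = model(**inputs, past_key_values=DynamicCache()).past_key_values
    return inputs["input_ids"][0].tolist(), cache


def generate_with_cache(model, tokenizer, prefix_messages, prefix_ids, cache, query,
                        **generation_args):
    """Answer one query, reusing the precomputed cache for the shared prefix."""
    messages = prefix_messages + [{"role": "user", "content": query}]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
    full_ids = inputs["input_ids"][0].tolist()
    if not shares_prefix(prefix_ids, full_ids):
        raise ValueError("prompt does not start with the cached prefix; cache cannot be reused")
    # generate() extends the cache in place, so each query gets its own copy.
    output = model.generate(**inputs, past_key_values=copy.deepcopy(cache), **generation_args)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0, len(full_ids):], skip_special_tokens=True)
```

With the `model`, `tokenizer`, and `messages` from the post, one would call `build_prefix_cache(model, tokenizer, messages)` once at startup and then `generate_with_cache(...)` per query, passing something like `max_new_tokens` in `generation_args`. The full prompt is still tokenized on each call, but the model only has to process the tokens after the cached prefix, which is where most of the time for a long schema is likely spent.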

