How can I keep using the base model for inference after fine-tuning?

Hi, I have used the Transformers pipeline to fine-tune a Llama 2 model for a chatbot.
I used the code below to create the model and tokenizer:

import torch
from transformers import AutoTokenizer, pipeline as hf_pipeline
from langchain.llms import HuggingFacePipeline

tokenizer = AutoTokenizer.from_pretrained(model_path)
hf_pipeline_obj = hf_pipeline(
    "text-generation",
    model=model_path,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    base_model_inference=True,
    trust_remote_code=True,
    device_map="auto",
    max_length=1000,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
llm = HuggingFacePipeline(pipeline=hf_pipeline_obj, model_kwargs={'temperature': 0.7})

The problem is that after fine-tuning, the chatbot no longer replies correctly when I ask it a general question. How can I keep the base model available for inference even after fine-tuning?
Can you explain this, please?

Thank you,

If you fine-tuned the model with LoRA, you can use the model.disable_adapters() method, as explained here: Load adapters with 🤗 PEFT
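As a rough illustration, here is a minimal sketch of toggling between the fine-tuned and base behaviour with the Transformers PEFT integration. It assumes your LoRA adapter was saved to a hypothetical directory "my-llama2-lora" and that model_path is the same base-model path used in your question:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the base Llama 2 weights and tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter produced by fine-tuning (hypothetical path).
model.load_adapter("my-llama2-lora")

chat = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Base-model behaviour: switch the adapter off for general questions.
model.disable_adapters()
print(chat("What is the capital of France?", max_new_tokens=50)[0]["generated_text"])

# Fine-tuned behaviour: switch the adapter back on for chatbot-specific prompts.
model.enable_adapters()

Because the adapter is only toggled on and off, a single copy of the model stays in memory and the same pipeline can serve both the general-purpose base model and the fine-tuned chatbot.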