I created a chatbot using LangChain's LlamaCpp wrapper with llama-2-13b-chat, and it keeps giving me incomplete responses that stop mid-sentence, such as: “Sure! Here’s an example that uses the case operator to assign different field values in different situations. Suppose you have”
The following are the parameters used:
llm = LlamaCpp(
    model_path="models/llama-2-13b-chat.ggmlv3.q2_K.bin",
    n_gpu_layers=40,
    n_batch=512,
    max_tokens=2000,
    streaming=True,
    callback_manager=callback_manager,
    verbose=False,
    temperature=0.75,
    top_p=0.9,
    top_k=40,
    repeat_penalty=1.18,
    f16_kv=True,
    last_n_tokens_size=64,
)
I have tried different models and different parameter settings, but the responses are still cut off before they finish.
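To confirm the responses are genuinely being cut off mid-sentence (rather than the model choosing to stop), I check whether the output ends with terminal punctuation. This is just a heuristic sketch of my own; `is_truncated` and the sample strings are not part of LangChain or llama.cpp:

```python
def is_truncated(text: str) -> bool:
    """Heuristic: treat a response as truncated if it does not
    end with sentence-final punctuation or a closing code fence."""
    text = text.rstrip()
    if not text:
        return False
    return not text.endswith((".", "!", "?", '"', "\u201d", "```"))

# The partial answer quoted above ends mid-sentence:
print(is_truncated("... Suppose you have"))  # True
print(is_truncated("Hope that helps."))      # False
```

Every response I get back from the chatbot fails this check, so the text really is being truncated rather than ended deliberately.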