Incomplete/partial response generation

I am using LangChain and HuggingFaceHub with the Mistral-7B model, like this:

HuggingFaceService.llm = HuggingFaceHub(repo_id="mistralai/Mistral-7B-v0.1", model_kwargs={"temperature": 0.001, "max_length": 1000000000})
Below is my prompt, but it's not generating the complete response. It just returns this:

{
  "result": "\n\nThere are two types of workflows that can be created in the platform which include:"
}

How can I get the complete, usable response? I upgraded to a Pro account, since this model is available to Pro users only.

Entering new LLMChain chain…
Prompt after formatting:
Use the following pieces of context to answer the question at the end. If you don’t know the answer, just say that you don’t know, don’t try to make up an answer.

[Document(page_content='Types of workflow in the DigitalChameleon platform There are two types of workflows that can be created in the platform which include: 1. Conversation: A series of nodes with questions or text displayed to the customer in a sequence one by one, to capture the response of Customer, is referred to as Conversation workflow. The nodes of a workflow of conversation type are loaded on the webpage to the customer one at a time. The flow can be modified to return to a previous flow or allow customer to resume work at a later point in time. Workflow will go to the next node only when the customer performs the desired action in the previous node as configured in the workflow. 2. Form: A one time loading of the nodes/questions/messages to the end customer all at once in the UI of a form. The form will be created in the similar manner as we create for conversation in the CMS except for the workflow type in the journey properties should be selected as Form while creating/copying the workflow.')]

Question: Types of workflow in the DigitalChameleon platform
Helpful Answer:

Have you tried the instruct model? https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
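
For example, here is a minimal sketch of switching to the instruct checkpoint. It assumes the same HuggingFaceHub setup as in the question (API token in the environment, temperature carried over); the [INST] ... [/INST] wrapping is the prompt format the Mistral instruct models expect, and the question string is just an illustration.

from langchain.llms import HuggingFaceHub

# Point at the instruct checkpoint instead of the base model.
llm = HuggingFaceHub(
    repo_id="mistralai/Mistral-7B-Instruct-v0.1",
    model_kwargs={"temperature": 0.001},
)

# Mistral instruct models expect the prompt wrapped in [INST] ... [/INST].
print(llm("[INST] List the types of workflow in the DigitalChameleon platform. [/INST]"))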

ANSWER:

It was because max_new_tokens is 20 by default. By increasing that value, I was able to get the full-length response.
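
For reference, a minimal sketch of the fix, assuming the same setup as in the question. Note the distinction between the two parameters: max_new_tokens caps only the generated tokens, whereas max_length counts the prompt plus the generation. The value 500 is illustrative.

from langchain.llms import HuggingFaceHub

# max_new_tokens (not max_length) controls how many tokens are generated;
# it defaults to 20 on the Hub inference API, so the response was truncated.
# 500 is an illustrative value; pick whatever your answers need.
llm = HuggingFaceHub(
    repo_id="mistralai/Mistral-7B-v0.1",
    model_kwargs={"temperature": 0.001, "max_new_tokens": 500},
)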

Increasing the maximum number of new tokens helps in some but not all cases.