I’m also using DeepL. No problem.
Normally this is how you would increase max_new_tokens, but without a way to specify it on the Botpress side, there is no way to do it.
@gkrishnan I’m late to the post but you can always manually pass in the model/pipeline:
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from langchain.llms import HuggingFacePipeline

# model_path points to your local (or Hub) Llama checkpoint
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Set generation kwargs, including max_new_tokens, directly on the pipeline
gen = pipeline('text-generation', model=model, tokenizer=tokenizer, max_new_tokens=200)

# Wrap the pipeline so LangChain can use it as an LLM
llama_llm = HuggingFacePipeline(pipeline=gen)
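For completeness, here's a rough usage sketch (the prompt string is just an example, and the exact call style depends on your LangChain version):

# The wrapped LLM will generate up to 200 new tokens, per max_new_tokens set above
response = llama_llm("Explain what max_new_tokens controls.")  # or llama_llm.invoke(...) on newer LangChain versions
print(response)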
I know I’m late, but for anyone who finds this in the future, here is what max_length and max_new_tokens do.
max_length caps the total sequence length, i.e. the input (prompt) tokens plus the generated tokens.
max_new_tokens caps only the number of newly generated tokens, excluding the input tokens.
Let me show you with code:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoModelForSeq2SeqLM

torch.set_default_device("cuda")

# trust_remote_code is needed for phi-2 on older transformers versions
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype="auto", trust_remote_code=True)
The Botpress people seem to be on HF, so there's always the option of sending them a direct mention and asking them (@ followed by their username).