I’m running Mixtral 8x7B for the first time and running into some difficulties: for a number of reasons the output is not what I expect.
The outputs are supposed to be structured JSON. This works some of the time, but in other cases I get strange sequences of ‘\xa0’ that break the JSON parsing, and sometimes the closing brace of the dict is simply missing at the end.
For example, here’s a snippet from my output:
'\t{\n"pairs": [\n\xa0\xa0\xa0\xa0{\n\xa0\xa0\xa0\xa0\xa0\xa0"suggestions": "how about another drink?"\n\xa0\xa0\xa0\xa0},'
It’s important to say that sometimes this works well, meaning sometimes I don’t see any ‘\xa0’ in the output and the parsing succeeds.
Another thing to note is that I’m getting this warning (maybe it’s related):
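To illustrate what breaks, this is roughly the kind of cleanup that is needed before json.loads will accept the output (a minimal sketch using the snippet above; json.loads only allows regular whitespace between tokens, and the truncated outputs still fail regardless):

import json

raw = '\t{\n"pairs": [\n\xa0\xa0\xa0\xa0{\n\xa0\xa0\xa0\xa0\xa0\xa0"suggestions": "how about another drink?"\n\xa0\xa0\xa0\xa0},'

# json.loads only accepts space, tab, newline and carriage return as
# whitespace between tokens, so the non-breaking spaces have to go.
cleaned = raw.replace('\xa0', ' ')

try:
    data = json.loads(cleaned)
except json.JSONDecodeError:
    # Still fails when the model never emits the closing ] and }.
    data = None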
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
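My reading of the second warning is that the tokenizer should be loaded with padding_side='left' and given an explicit pad token. This is roughly what I tried (a sketch, using the same model_id as in the config below, not taken from any official example):

tokenizer = transformers.AutoTokenizer.from_pretrained(
    model_id,
    padding_side='left',
)
# Mixtral has no dedicated pad token, so reuse EOS for padding.
tokenizer.pad_token = tokenizer.eos_token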
Finally, here is the configuration I use:
import torch
import transformers

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

# Load the model in bfloat16 and shard it across available devices.
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map='auto',
)
model.eval()

tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

# Text-generation pipeline that returns only the newly generated text.
pipeline = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
    task="text-generation",
    temperature=0.1,
    top_p=0.15,
    top_k=0,
    max_new_tokens=512,
    repetition_penalty=1.1,
)
The warning is displayed regardless of whether I set the tokenizer’s padding side to left, and I’m also running this through a LangChain class that handles Hugging Face pipelines.
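For completeness, the LangChain side looks roughly like this (I’m assuming HuggingFacePipeline is the relevant wrapper class here; the actual prompt and chain details are omitted):

from langchain_community.llms import HuggingFacePipeline
# older versions: from langchain.llms import HuggingFacePipeline

# Wrap the transformers pipeline so LangChain can call it as an LLM.
llm = HuggingFacePipeline(pipeline=pipeline)

# Prompt omitted here; the result is the generated text as a string.
output = llm.invoke("...")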