Strange punctuation and grammatical errors in quantized Llama-3-70b-Instruct

I’ve been using the 8-bit quantized version of Llama-3-70b-Instruct for a research project. Across several scripts, I’ve noticed the following behavior:

  1. The answers (usually about the semantics and nuanced meanings of words) are generally coherent, but
  2. they often contain repeated commas, poor grammar, and missing or seemingly misplaced words.

I’m really at a loss as to what might be causing this. Varying the sampling parameters changes the output (as it should), but does not resolve the issues I’ve described. I’ve pasted my test script below so you can see my setup, followed by two examples of the kinds of errors it can produce.

import os

# Restrict this process to GPU 1; set before anything initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "meta-llama/Meta-Llama-3-70b-Instruct"
HF_TOKEN = "[YOUR_HF_TOKEN]"
# model_path = "[MODEL_PATH]"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True, token=HF_TOKEN)

# Load the model with 8-bit quantization.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    trust_remote_code=True,
    use_cache=True,
    load_in_8bit=True,
    # attn_implementation="flash_attention_2",  # The error persists with or without FlashAttention 2
    revision="main"
)

pipe = pipeline(
    model=model,
    tokenizer=tokenizer,
    task='text-generation')

# Simple interactive loop: read a prompt, generate, print.
while True:
    print("Type your prompt for the LLM below. Type <EXIT> to quit.")
    user_query = input("USER: ")
    if user_query.casefold() == "<exit>":
        print("Quitting conversation.")
        break

    messages = [
        {
            # Note: the system prompt is sent with the "user" role here.
            "role": "user",
            "content": "[SYSTEM_PROMPT]"
        },
        {
            "role": "user",
            "content": str(user_query)
        },
    ]

    # Render the messages into a single prompt string using the model's chat template.
    prompt = pipe.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    outputs = pipe(
        prompt,
        # batch_size=8
        max_new_tokens=1024,
        do_sample=True,
        temperature=0.2,
        top_p=0.8
    )

    print(outputs[0]["generated_text"])
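Since changing the sampling parameters alters the output without removing the artifacts, one way to make that comparison systematic is to sweep a small grid of settings over the same prompt. The following is only a minimal sketch: the helper name `sampling_grid` and the grid values are assumptions for illustration (only temperature=0.2 and top_p=0.8 come from the script above), and the actual generation call is left commented out because it needs the loaded pipeline.

```python
from itertools import product


def sampling_grid(temperatures, top_ps, max_new_tokens=1024):
    """Return one dict of generation kwargs per (temperature, top_p) pair."""
    return [
        {
            "do_sample": True,
            "temperature": t,
            "top_p": p,
            "max_new_tokens": max_new_tokens,
        }
        for t, p in product(temperatures, top_ps)
    ]


# Hypothetical grid; only temperature=0.2 / top_p=0.8 appear in the original script.
grid = sampling_grid([0.2, 0.6, 1.0], [0.8, 0.95])

for kwargs in grid:
    print(kwargs)
    # With the pipeline from the script loaded, each setting could be tried as:
    # outputs = pipe(prompt, **kwargs)
```

Keeping the prompt fixed while only the sampling settings change makes it easier to tell whether the artifacts track the sampling settings or show up regardless.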

Here is one example of the error. Note the repeated commas in bullet 3 and the stray % characters in bullets 3 to 5:

Regarding the commonalities in meaning between "people" and "citizen", here are some points to consider:

1. **Humanity**: Both "people" and "citizen" refer to human beings, emphasizing their shared humanity and existence as individuals or groups.

2. **Collective identity**: Both terms imply a sense of collective identity, a group of individuals who share common characteristics, as a community or society.

3. **Social context**: Both "people" and "citizen" are often used in social contexts,,,, % implying a connection to a particular society, as a % of individuals living together.

4. **Rights and responsibilities**: The term "citizen" often carries connotations of rights and responsibilities within a particular society or state, as a % of individuals who are part of a larger political entity.

5. **Community involvement**: Both "people" and "citizen" suggest involvement in community activities, as a % of individuals who participate in and contribute to the well-being of their community.

Perhaps an uglier example - what is happening in the last two sentences?

The words "story" and "tale" do share some common meanings and connotations. Here are some of the shared meanings:

1. **Narrative**: Both "story" and "tale" refer to a sequence of events or a narrative that is told or recounted.

2. **Fictional account**: Both words can refer to a fictional or imaginary account of events, (e.g., a fairy tale or a short story).

3. **Oral tradition**: Both "story" and "tale" have roots in oral tradition, A "tale" was often used to describe an oral narrative passed down through generations, (e.g., a folk tale), and a "story" can also refer to an oral account of events.

4. **Entertainment**: Both words can imply a sense of entertainment or amusement, of the listener or reader (e.g., a engaging story or a thrilling tale).

5. **Imagination**: Both "story" and "tale" can evoke a sense of imagination or fantasy (e.g., a fantastical tale or a fictional story).

6. **Account of events**: Both words can refer to a detailed account of events or a description of what happened (e.g., a story about a historical event or a tale of adventure).

It's worth noting that while "story" is a more general term that can refer to any kind of narrative,,2019,2019,,,,,,, a "tale" often has a more specific connotation of being a fictional or imaginative narrative,, a sense of wonder or enchantment.
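To count how often these artifacts occur across many generations, a small checker along the following lines could be used. This is a rough sketch under assumptions: the two patterns are taken only from the sample outputs above (runs of commas, and a % that does not follow a digit), the names `ARTIFACT_PATTERNS` and `find_artifacts` are hypothetical, and the patterns are neither exhaustive nor free of false positives.

```python
import re

# Heuristic patterns based on the two sample outputs; hypothetical and not exhaustive.
ARTIFACT_PATTERNS = {
    # Two or more commas in a row, optionally separated by whitespace.
    "repeated_commas": re.compile(r",(?:\s*,)+"),
    # A percent sign that does not directly follow a digit (so "50%" is ignored).
    "stray_percent": re.compile(r"(?<!\d)%"),
}


def find_artifacts(text):
    """Return {pattern_name: [matched strings]} for each pattern found in text."""
    hits = {}
    for name, pattern in ARTIFACT_PATTERNS.items():
        matches = [m.group(0) for m in pattern.finditer(text)]
        if matches:
            hits[name] = matches
    return hits


sample = "are often used in social contexts,,,, % implying a connection"
print(find_artifacts(sample))
```

Running this over saved outputs from different configurations (for example, quantized versus unquantized, if that is feasible) would give a rough artifact rate per configuration rather than relying on eyeballing individual answers.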

I’m hesitant to share prompts due to the nature of the project, but I can share more information if needed. I’d greatly appreciate any help!