Following the text generation code template here, I've been trying to generate outputs from Llama 2 but keep running into stochastic generations.
For instance, running the same prompt through model.generate() twice produces two different outputs, as shown in the example below.
I've used model.generate() with other LLMs (e.g., Flan-T5) with the other parameters unchanged and obtained deterministic outputs (roughly the setup sketched after the reproduction code below).
I also tried AutoModelForCausalLM instead of LlamaForCausalLM (also sketched below), but still got different outputs each time for the same prompt.
How do I make sure I get the same text generated each time?
Code to reproduce:
from transformers import AutoTokenizer, LlamaForCausalLM
model_name = "meta-llama/Llama-2-13b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="/data2/racball/llms")
model = LlamaForCausalLM.from_pretrained(
    model_name,
    cache_dir="/data2/racball/llms",
    device_map="sequential",
)
prompt = "What is up?"
inputs = tokenizer(prompt, return_tensors="pt")
# Generate
generate_ids = model.generate(inputs.input_ids, max_length=30)
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# Run1: 'What is up?\n\nI have a problem with my `docker-compose.yml` file. I have a service that should run a'
# Run2: "What is up?\n\nIt's been a while since I've posted, but I've been pretty busy with work and other"
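
For reference, the Flan-T5 setup that gave me identical outputs on repeated runs looked roughly like the sketch below (the exact checkpoint name and max_length here are illustrative, not necessarily the ones I used):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Illustrative checkpoint name
t5_name = "google/flan-t5-xl"
t5_tokenizer = AutoTokenizer.from_pretrained(t5_name)
t5_model = AutoModelForSeq2SeqLM.from_pretrained(t5_name)

t5_inputs = t5_tokenizer("What is up?", return_tensors="pt")
t5_ids = t5_model.generate(t5_inputs.input_ids, max_length=30)
# Running this repeatedly decodes to the same text every time
print(t5_tokenizer.batch_decode(t5_ids, skip_special_tokens=True)[0])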
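
The AutoModelForCausalLM attempt was the same script with only the model class swapped:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    cache_dir="/data2/racball/llms",
    device_map="sequential",
)
# Same generate() call as above; the outputs still differ between runs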