Help with Llama 2 Finetuning Setup

Hello!
I’m trying to follow the Llama 2 fine-tuning example provided by Databricks here: https://github.com/databricks/databricks-ml-examples/blob/b1ca47c058461f7fde214914d53b051990064d94/llm-models/llamav2/llamav2-7b/scripts/fine_tune_deepspeed.py#L94. However, I ran into the following error:

"lib/python3.9/site-packages/transformers/generation/configuration_utils.py", line 354, in validate
    raise ValueError(
ValueError: do_sample is set to False. However, temperature is set to 0.9 – this flag is only used in sample-based generation modes. Set do_sample=True or unset temperature to continue.

It is raised for the meta-llama/Llama-2-7b-chat-hf model on this line:

model = transformers.AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    use_auth_token=True,
)

I don’t set temperature anywhere in the script.
Does anyone have any idea what the issue is?
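
For what it’s worth, the temperature=0.9 in the message seems to come from the generation_config.json that ships with the checkpoint rather than from the script, while do_sample simply defaults to False, and the stricter GenerationConfig.validate() check in recent transformers rejects that combination. A quick sketch to see what the checkpoint actually ships (assuming huggingface_hub is installed and your token has access to the gated repo):

# Inspect the generation config bundled with meta-llama/Llama-2-7b-chat-hf.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    "meta-llama/Llama-2-7b-chat-hf",
    "generation_config.json",
    token=True,  # reuse the token saved by `huggingface-cli login`
)
with open(path) as f:
    print(json.load(f))  # shows do_sample / temperature / top_p as stored in the repo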


Try setting the temperature parameter to 0.1 when initialising the Hugging Face pipeline.
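
A sketch of that suggestion (illustrative only, assuming access to the gated repo): pass the sampling flags as generation kwargs when building the pipeline, so temperature is only used together with do_sample=True.

# Build a text-generation pipeline with consistent sampling flags.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",
    use_auth_token=True,
    do_sample=True,   # sampling must be enabled for temperature to apply
    temperature=0.1,
)

Note that on the affected transformers versions the ValueError can still be raised while the checkpoint itself is being loaded, which is what the reply below runs into.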

ValueError: do_sample is set to False. However, temperature is set to 0.9 – this flag is only used in sample-based generation modes. Set do_sample=True or unset temperature to continue.

my code is:

import torch
from transformers import BitsAndBytesConfig, GenerationConfig, LlamaForCausalLM, LlamaTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = LlamaForCausalLM.from_pretrained(
    MODEL_NAME,
    device_map="auto",
    trust_remote_code=True,
    use_auth_token=True,
    temperature=0.1,
    do_sample=True,
    quantization_config=bnb_config,
)

tokenizer = LlamaTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token

I explicitly changed do_sample=True in configuration_utils.py, but it didn’t work.
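
A less invasive workaround than patching site-packages might be to download the checkpoint, fix its generation_config.json so the flags are consistent, and load from the local copy. A sketch, assuming huggingface_hub is installed and you have access to the gated repo; the local_dir name is just an example:

import json
from pathlib import Path
from huggingface_hub import snapshot_download

# Download a local copy of the checkpoint (large download).
local_dir = snapshot_download(
    "meta-llama/Llama-2-7b-chat-hf",
    local_dir="llama-2-7b-chat-hf",
    token=True,
)

# Make the stored generation config consistent: temperature/top_p only apply
# when sampling is enabled.
cfg_path = Path(local_dir) / "generation_config.json"
cfg = json.loads(cfg_path.read_text())
cfg["do_sample"] = True
cfg_path.write_text(json.dumps(cfg, indent=2))

# Then point from_pretrained at the local copy instead of the hub id:
# model = LlamaForCausalLM.from_pretrained(local_dir, quantization_config=bnb_config, device_map="auto")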

I am having the same issue. It was working without problems until last night.
I tried to change the config file and update it by adding do_sample=true, but that did not work.

!pip install -qqq bitsandbytes --progress-bar off

!pip install -qqq torch --progress-bar off

!pip install -q -U git+https://github.com/huggingface/transformers.git

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"

# bnb_config is built elsewhere in the notebook (not shown)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto", quantization_config=bnb_config)

Error msg:

ValueError                                Traceback (most recent call last)
in <cell line: 1>()
----> 1 model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto", quantization_config=bnb_config)

5 frames
/usr/local/lib/python3.10/dist-packages/transformers/generation/configuration_utils.py in validate(self)
    352                 )
    353             if self.temperature != 1.0:
--> 354                 raise ValueError(
    355                     greedy_wrong_parameter_msg.format(flag_name="temperature", flag_value=self.temperature)
    356                 )

ValueError: do_sample is set to False. However, temperature is set to 0.9 – this flag is only used in sample-based generation modes. Set do_sample=True or unset temperature to continue.

For anyone looking for a solution: this was an issue with the latest release of Hugging Face transformers. Downgrade to the previous version with !pip install git+https://github.com/huggingface/transformers@v4.31-release to fix the issue.
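
If you prefer pinning the PyPI release instead of installing from the GitHub branch, the following should be equivalent:

!pip install -q "transformers==4.31.0"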


Thank you.

Downgrading made it work :+1:

Thanks, it works!

I was using autotrain and got the same error, but downgrading transformers didn’t solve the issue.

!pip install huggingface_hub autotrain-advanced
!pip install git+https://github.com/huggingface/transformers@v4.31-release

This is how I installed the packages.
Is there any solution for this?
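
One thing worth double-checking is which transformers version is actually active in the runtime after both installs, since autotrain-advanced may pull in its own pinned transformers and the install order decides which version wins. A minimal check:

import transformers
print(transformers.__version__)  # confirm the v4.31 downgrade actually took effect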