Set_seed is not working

nasos10 · September 8, 2024, 11:36am

Okay, it’s my first post here and I am not a very experienced programmer, so forgive me if I am asking something trivial :))

I am working on fine-tuning LLMs on a downstream task, and I use the Accelerate library to work on 2 GPUs (DDP setting). I can’t share the code right now, but my setup looks like the tutorial in run_native_loop.

I am trying to set the seed with the transformers set_seed method to have reproducible experiments, and I am altering the seed to have 3 different runs. The problem is that the data is loaded in batches in the same way every time (meaning that the first batch is always the same, on both GPUs). After shuffling, in epoch 2 I get shuffled training samples, but I still get the same ones even when I alter the seed. This is something I cannot explain. How is this even possible?
The paradox is that setting the seed works fine for things like adding new randomly initialized rows in the embedding layer (different seed = different initialization), but the data keeps loading in the same order.
Is this the way it is supposed to be? It seems really strange to me. Maybe Accelerate creates the problem? Or is it the way DataLoaders are wrapped for the DDP scenario?

To clarify, the data is indeed shuffled before a new epoch. What drives me crazy is that it is still shuffled in the same way: I alter the seed and I get the same sample order in epoch 2.

John6666 · September 8, 2024, 12:11pm

I am not familiar with LLMs, but maybe this is it?

github.com/huggingface/transformers

transformers.set_seed seems to do nothing

opened 06:30PM - 08 May 23 UTC

closed 11:19AM - 13 Jun 23 UTC

mojejmenojehonza

### System Info

- `transformers` version: 4.28.0.dev0
- Platform: Linux-5.19.0…-41-generic-x86_64-with-glibc2.35
- Python version: 3.10.10
- Huggingface_hub version: 0.13.3
- Safetensors version: not installed
- PyTorch version (GPU?): 2.0.0 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No

### Who can help?

@gante, @younesbelkada

### Information

- [ ] The official example scripts
- [X] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)

### Reproduction

1. install transformes with support for alpaca model
2. run this code with one seed
3. run this code with any other seed
4. see that the results are the same

```
from transformers import GenerationConfig, LlamaTokenizer, LlamaForCausalLM, set_seed
from torch import float16, compile, no_grad

set_seed(621)

# Enhances prompt
def enhance_prompt(prompt, input=None):
    if input:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Input:
{input}

### Response:"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:"""

# Gets response from Alpaca
def get_response(prompt):
    with no_grad():
        outputs = alpaca.generate(input_ids=tokenizer(prompt, return_tensors="pt").input_ids.to("cuda"),
                                  generation_config=generation_config,
                                  return_dict_in_generate=True,
                                  output_scores=True)
    outputs = tokenizer.decode(outputs.sequences[0], skip_special_tokens=True)
    return outputs.split("### Response:")[1]

# Sets up Alpaca
tokenizer = LlamaTokenizer.from_pretrained("chainyo/alpaca-lora-7b")
alpaca = LlamaForCausalLM.from_pretrained("chainyo/alpaca-lora-7b", load_in_8bit=True, torch_dtype=float16, device_map="auto")
generation_config = GenerationConfig(temperature=0.2, top_p=0.75, top_k=40, num_beams=4, max_new_tokens=64)
alpaca.eval()
compile(alpaca)

# Gets output from Alpaca
prompt = enhance_prompt("Write a simple poem about flowers.")
out = get_response(prompt)

# Prints Alpaca's output
print(out)
```

### Expected behavior

Model will output two different answers but now it gives the same every seed I try.

Two notes:

1. You should pass do_sample=True in your generation config or in your .generate() call. Most models have it off by default, causing the generation to be deterministic (and ignoring parameters like temperature, top_k, etc).

2. With temperature=0.2, the relative weight of the most likely logits is massively increased, making generation almost deterministic. Even if there are no bugs in your script, it’s far from guaranteed that two different seeds produce different outputs with such low temperature.
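To make the point about low temperature concrete, here is a minimal sketch in plain Python of how dividing logits by the temperature sharpens the sampling distribution. The logits are made-up numbers for illustration, and `softmax_with_temperature` is a hypothetical helper written for this example, not a transformers function.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return sampling probabilities for logits scaled by 1/temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.0]

default_probs = softmax_with_temperature(logits, 1.0)
low_temp_probs = softmax_with_temperature(logits, 0.2)

print("temperature=1.0:", [round(p, 3) for p in default_probs])
print("temperature=0.2:", [round(p, 3) for p in low_temp_probs])
```

In this toy case the top token takes more than 99% of the probability mass at temperature 0.2, so different seeds will usually pick the same token even when sampling is enabled.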

nasos10 · September 8, 2024, 12:22pm

Well, thank you for your response, but I don’t think this is the case. My code is only for training, while the above-mentioned post was about the generate function (used at inference). The dataloader is the problem with my code, but I can’t see why.
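One way to narrow this down is to check, outside of Accelerate, whether the shuffle order of a plain PyTorch `DataLoader` follows the seed. The sketch below is a toy under stated assumptions, not the setup from this thread: it uses a tiny `TensorDataset`, a hypothetical helper `epoch_order`, and an explicitly seeded `torch.Generator` passed to the loader.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def epoch_order(seed, num_samples=16, batch_size=4):
    """Return the sample order of one shuffled epoch for a given seed (toy data)."""
    # Each sample is just its own index, so the order is easy to read off.
    dataset = TensorDataset(torch.arange(num_samples))
    # Assumption for this sketch: the shuffle is driven by an explicit generator.
    generator = torch.Generator()
    generator.manual_seed(seed)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True, generator=generator)
    return [int(i) for (batch,) in loader for i in batch]

print("seed 0:", epoch_order(0))
print("seed 0:", epoch_order(0))  # same seed, same order
print("seed 1:", epoch_order(1))  # different seed, different order expected
```

If the order changes with the seed here but not in the real training script, the way the dataloader is wrapped for the distributed setting is a reasonable next place to look.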
