Hi,
I’m using the following code to load a pre-trained model:
from transformers import LlamaTokenizer, LlamaForCausalLM

model_name = "samwit/koala-7b"
tokenizer = LlamaTokenizer.from_pretrained(model_name)
base_model = LlamaForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map='auto',
    output_hidden_states=True,
)
Then I run the following code on two different machines:
input_ids = tokenizer("AAA", return_tensors="pt")
output = base_model(**input_ids)
# mean of the last hidden layer over the token dimension; first value of the resulting vector
output.hidden_states[-1].mean(dim=1).squeeze().tolist()[0]
On one machine I get -0.328857421875, and on the other I get -0.28759765625.
The input_ids are identical, but the outputs differ.
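For reference, this is roughly how I confirm the token ids match on both machines (a minimal sketch, reusing the tokenizer defined above):

# print the raw token ids so they can be compared across machines by eye
print(tokenizer("AAA", return_tensors="pt").input_ids)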
Setting a manual seed didn’t make any difference.
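Roughly what I tried for the seeding (a sketch; I’m assuming torch.manual_seed plus the CUDA variant covers any stochastic op on the GPU side):

import torch

# fix the seeds before the forward pass, in case anything is stochastic
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)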
Does this make sense? Shouldn’t this be deterministic?
Thank you