Hi,
I’m using the following code to load a pre-trained model:
from transformers import LlamaTokenizer, LlamaForCausalLM

model_name = "samwit/koala-7b"
tokenizer = LlamaTokenizer.from_pretrained(model_name)
base_model = LlamaForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map='auto',
    output_hidden_states=True,
)
Then I run the following code on two different machines:
input_ids = tokenizer("AAA", return_tensors="pt")
output = base_model(**input_ids)
# mean of the last hidden layer over the token dimension; first value of the resulting vector
output.hidden_states[-1].mean(dim=1).squeeze().tolist()[0]
On one machine I get -0.328857421875, and on the other I get -0.28759765625.
The input_ids are identical, but the outputs differ.
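For reference, this is roughly how I confirm the token ids match on both machines (a minimal sketch, reusing the tokenizer defined above):

# print the raw token ids so they can be compared across machines by eye
print(tokenizer("AAA", return_tensors="pt").input_ids)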
Setting a manual seed didn’t make any difference.
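Roughly what I tried for the seeding (a sketch; I’m assuming torch.manual_seed plus the CUDA variant covers any stochastic op on the GPU side):

import torch

# fix the seeds before the forward pass, in case anything is stochastic
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)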
Does this make sense? Shouldn’t this be deterministic?
Thank you