hi all,

I want to do hyperparameter tuning and reload my model in a loop. I've noticed that if I load the model twice in a row, as below, the second call does not give back the same model: the weights are initialized differently. Oddly, this is reproducible across executions: the first load always yields the same model, and so does each subsequent load, but within a run the first load is always != the second, and so on.

I’m going crazy. What is going on here?

```
from transformers import BertForSequenceClassification

# First load, saved to ./model1/
model = BertForSequenceClassification.from_pretrained(
    MODEL, num_labels=len(label2id), id2label=id2label, label2id=label2id,
    output_attentions=False, output_hidden_states=False)
model.save_pretrained('./model1/')
# Second load of the same checkpoint, saved to ./model2/
model = BertForSequenceClassification.from_pretrained(
    MODEL, num_labels=len(label2id), id2label=id2label, label2id=label2id,
    output_attentions=False, output_hidden_states=False)
model.save_pretrained('./model2/')
```
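
As a quick sanity check I also made sure the save/load round trip itself is not to blame (a minimal sketch; `model` here is still the second model from the snippet above):

```
import torch
from transformers import BertForSequenceClassification

# Sanity check: does save_pretrained -> from_pretrained preserve the weights?
reloaded = BertForSequenceClassification.from_pretrained('./model2/')
for p_mem, p_disk in zip(model.parameters(), reloaded.parameters()):
    if not torch.allclose(p_mem, p_disk):
        print("Round trip changed the weights!")
        break
else:
    print("Round trip preserves the weights.")
```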

If I then run this comparison, the weights are not the same:

```
import torch
from transformers import BertForSequenceClassification

model1 = BertForSequenceClassification.from_pretrained('./model1/')
model2 = BertForSequenceClassification.from_pretrained('./model2/')

for p1, p2 in zip(model1.parameters(), model2.parameters()):
    if not torch.allclose(p1, p2):
        print("Weights are not the same.")
        break
else:
    # for/else: only reached if the loop never hit "break"
    print("Weights are the same.")
```
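
To see where the two checkpoints actually diverge, I also printed the names of the mismatching parameters (same idea as above, just using `named_parameters()` instead of `parameters()`):

```
import torch
from transformers import BertForSequenceClassification

model1 = BertForSequenceClassification.from_pretrained('./model1/')
model2 = BertForSequenceClassification.from_pretrained('./model2/')

# List every parameter whose values differ between the two saved models.
for (name, p1), (_, p2) in zip(model1.named_parameters(), model2.named_parameters()):
    if not torch.allclose(p1, p2):
        print(f"differs: {name}")
```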

I’ve set every imaginable thing to be deterministic:

```
import os
import random
import numpy as np
import torch

seed_val = 42  # example value; I use the same seed throughout

random.seed(seed_val)
np.random.seed(seed_val)
torch.manual_seed(seed_val)
torch.cuda.manual_seed_all(seed_val)

# cuBLAS/cuDNN determinism settings
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
torch.use_deterministic_algorithms(True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
```
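
One hypothesis I want to rule in or out: if `from_pretrained` draws random numbers (e.g. to initialize the new classification head), the global RNG state has already advanced by the time the second load runs, even though everything is seeded once at the top. If that's the cause, re-seeding immediately before each load should make the two models identical (a sketch under that assumption, reusing `seed_val` from above):

```
import torch
from transformers import BertForSequenceClassification

# Reset the RNG state right before each load, so any randomly
# initialized parts start from the same random state.
torch.manual_seed(seed_val)
model1 = BertForSequenceClassification.from_pretrained(
    MODEL, num_labels=len(label2id), id2label=id2label, label2id=label2id)

torch.manual_seed(seed_val)
model2 = BertForSequenceClassification.from_pretrained(
    MODEL, num_labels=len(label2id), id2label=id2label, label2id=label2id)
```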

Any help is appreciated!