Hi
I have modified a BERT model a bit by adding small `Linear` layers between its layers; the only random part is the random initialization done for these layers, as below:
W = torch.nn.init.xavier_normal_(tensor, gain=math.sqrt(2))  # tensor is the new layer's weight
I put these initializations where each layer is defined. I am getting a 3-4% difference in results from run to run, and would really appreciate your help fixing this issue.
Could you please advise on how I should handle initialization on top of a BERT model? Should it all go inside `_init_weights()`? Does it make a difference whether this is done inside that function or elsewhere in the model?
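For context, here is a minimal sketch of the two options I am comparing (the class name `MyBertWithExtraLinears` and the `extra_linears` attribute are just placeholders for my actual model):

```python
import math

import torch
from transformers import BertModel, BertPreTrainedModel


class MyBertWithExtraLinears(BertPreTrainedModel):  # placeholder name, not my real class
    def __init__(self, config):
        super().__init__(config)
        self.bert = BertModel(config)
        # the small extra Linear layers added between BERT's layers
        self.extra_linears = torch.nn.ModuleList(
            [torch.nn.Linear(config.hidden_size, config.hidden_size)
             for _ in range(config.num_hidden_layers)]
        )
        # Option A (what I currently do): initialize right here, at definition time
        # for layer in self.extra_linears:
        #     torch.nn.init.xavier_normal_(layer.weight, gain=math.sqrt(2))
        self.init_weights()  # runs _init_weights on every submodule

    def _init_weights(self, module):
        # Option B: route the custom init through the standard hook instead
        if any(module is layer for layer in self.extra_linears):
            torch.nn.init.xavier_normal_(module.weight, gain=math.sqrt(2))
            if module.bias is not None:
                torch.nn.init.zeros_(module.bias)
        else:
            super()._init_weights(module)
```

Is Option B the recommended pattern, or is Option A equivalent as long as the seed is fixed?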
Hugging Face's run_glue.py fixes the random seeds once at the top, before everything else runs; should I re-set the seed each time right before the initialization?
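To make it concrete, this is what I mean (just a sketch; the seed value 42 is arbitrary):

```python
from transformers import set_seed

set_seed(42)  # what run_glue.py does once, early in main()
model = MyBertWithExtraLinears.from_pretrained("bert-base-uncased")

# ...or do I need to call set_seed() again right before the
# randomly initialized extra layers are created?
```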
I am really struggling with this issue and would greatly appreciate your help. @sgugger @stas
Hi
I can confirm the same issue also happens for the BERT model without any modifications. For this, I ran it on MRPC for 3 epochs; here are the two results: