Hi all,
I am trying to fine-tune BERT on a simple text classification task.
However, I get different results with the Hugging Face library (torch 1.8.1+cu111) and Google's official code (TensorFlow v1.15).
I wonder whether the Hugging Face implementation includes any optimization for fine-tuning BERT?
As far as I can tell, I use the same hyper-parameters, yet I get higher performance with the Hugging Face library.
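One sketch of how I would rule out a silent hyper-parameter mismatch: write down the settings each script actually uses and diff them programmatically. The values below are just the common BERT fine-tuning defaults, not my real configuration; one known difference worth checking is that transformers' `AdamW` applies Adam bias correction by default, while the original TF BERT `AdamWeightDecayOptimizer` omits it.

```python
# Hypothetical side-by-side of the settings each run actually uses.
# Replace the values with the ones from your own scripts.
hf_run = {
    "learning_rate": 2e-5,
    "batch_size": 32,
    "epochs": 3,
    "max_seq_length": 128,
    "warmup_proportion": 0.1,
    "weight_decay": 0.01,
    "adam_bias_correction": True,   # transformers' AdamW default (correct_bias=True)
}
# The original TF BERT optimizer skips bias correction.
tf_run = dict(hf_run, adam_bias_correction=False)

# Report any setting that silently differs between the two runs.
diffs = {k: (hf_run[k], tf_run[k]) for k in hf_run if hf_run[k] != tf_run[k]}
print(diffs)
```

In my case a diff like this would at least tell me whether the gap is explained by the optimizer rather than the model itself.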
I have also checked the details in a similar issue:
However, it did not help me.