Validation accuracy differs dramatically between PyTorch and TensorFlow

I tried to fine-tune a BERT model for a text classification task using the same hyperparameters (learning rate, warmup steps, batch size, number of epochs) in PyTorch and TensorFlow. With TensorFlow, the validation accuracy is dramatically lower: around 96% in PyTorch versus around 76% in TensorFlow. One thing I noticed is the difference in GPU memory usage (PyTorch: ~12 GB, TF: ~8 GB). Shouldn't we expect similar accuracy from both?

  • transformers version: 3.5.1
  • Platform: Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.7.0+cu101 (True)
  • Tensorflow version (GPU?): 2.3.0 (True)
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

import tensorflow as tf
from transformers import TFBertForSequenceClassification

# num_labels, lr_schedule, and epochs are defined elsewhere in the script (not shown here)
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=num_labels)

optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
model.compile(optimizer=optimizer, loss=model.compute_loss, metrics=['accuracy'])
history = model.fit(..., epochs=epochs, batch_size=32)
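
The snippet above passes `lr_schedule` to Adam without showing how it is built, so it is hard to tell whether the warmup really matches the PyTorch run. As a rough sketch only, assuming the intended schedule is linear warmup followed by linear decay to zero, the per-step learning rate could be computed as below and compared against what each framework actually uses. The function name `linear_warmup_decay` and the numeric values are hypothetical and not taken from the original script.

```python
def linear_warmup_decay(step, base_lr, warmup_steps, total_steps):
    """Learning rate at `step`: linear warmup to base_lr, then linear decay to 0.

    Assumption: this is the schedule shape intended by `lr_schedule`;
    the original post does not show its definition.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Illustrative values only, not the ones used in the original experiment.
    base_lr, warmup_steps, total_steps = 2e-5, 100, 1000
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_warmup_decay(step, base_lr, warmup_steps, total_steps))
```

Printing the learning rate at a few steps in both the PyTorch and TensorFlow scripts and checking them against a reference like this is one way to confirm that "same parameters" also means the same effective schedule.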