Regression is failing when fine-tuning with BERT/GPT-2/ALBERT

I have been trying to use BertModel, ALBERT, and GPT-2 for fine-tuning on my regression task, and I keep getting unwanted results, which I will describe below:

I tried two approaches:

  1. I used the CLS token embedding and fine-tuned my entire custom model, but it produced some random number repeated over and over across my output matrix.

  2. I simply passed the CLS token embedding to a feed-forward NN. In this case it also produced some random number.

What can be the solution to this problem?
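For reference, the second approach can be sketched roughly like this (a minimal sketch: the encoder output is replaced with a random stand-in tensor here; in practice it would come from something like TFBertModel from the transformers library):

```python
import tensorflow as tf

# Stand-in for the encoder output; in practice this would be
# the last hidden state of a pretrained model, shape (batch, seq_len, hidden)
batch_size, seq_len, hidden_dim = 4, 16, 768
sequence_output = tf.random.normal((batch_size, seq_len, hidden_dim))

# The CLS token embedding is the first position of the sequence output
cls_embedding = sequence_output[:, 0, :]  # shape (batch, hidden_dim)

# Simple feed-forward regression head on top of the CLS embedding
regression_head = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),  # single scalar output for regression
])

predictions = regression_head(cls_embedding)  # shape (batch, 1)
```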

class Custom_GPT(tf.keras.Model):

  def __init__(self, embedding_dim):
      # (layer definitions omitted)
      ...

  def call(self, input_ids):
      # (forward pass omitted)
      ...
The model doesn’t seem to be learning anything here; it generates a random constant repeatedly.

Are you returning x at the end of call? I am not familiar with TensorFlow, but I assume that you still have to return the final logits; otherwise the method will implicitly return None.
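To illustrate the point: here is a minimal sketch of a tf.keras.Model subclass with an explicit return at the end of call (the layer names and sizes are placeholders, not the original code):

```python
import tensorflow as tf

class CustomRegressor(tf.keras.Model):
    """Minimal regression head; layer names/sizes are illustrative only."""

    def __init__(self, embedding_dim):
        super().__init__()
        # Dense layers infer their input size on first call,
        # so embedding_dim is kept only to mirror the original signature
        self.hidden = tf.keras.layers.Dense(64, activation="relu")
        self.out = tf.keras.layers.Dense(1)

    def call(self, inputs):
        x = self.hidden(inputs)
        x = self.out(x)
        return x  # without this line, call() implicitly returns None

model = CustomRegressor(embedding_dim=768)
y = model(tf.random.normal((4, 768)))  # shape (4, 1)
```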