Regression is failing when fine-tuning with BERT/GPT-2/ALBERT

I have been trying to use BertModel, ALBERT, and GPT-2 for fine-tuning on my regression task, and I keep getting unwanted results, which I will describe below:

I tried two approaches:

  1. I used the CLS token embedding and fine-tuned my entire custom model, but it produced some random number repeated over and over across my output matrix.

  2. I simply passed the CLS token embedding to a feed-forward NN. In this case it also produced some random number.

What can be the solution to this problem?
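For reference, the second approach can be sketched roughly like this (a minimal sketch: the encoder output is replaced with a random stand-in tensor here; in practice it would come from something like TFBertModel from the transformers library):

```python
import tensorflow as tf

# Stand-in for the encoder output; in practice this would be
# the last hidden state of a pretrained model, shape (batch, seq_len, hidden)
batch_size, seq_len, hidden_dim = 4, 16, 768
sequence_output = tf.random.normal((batch_size, seq_len, hidden_dim))

# The CLS token embedding is the first position of the sequence output
cls_embedding = sequence_output[:, 0, :]  # shape (batch, hidden_dim)

# Simple feed-forward regression head on top of the CLS embedding
regression_head = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),  # single scalar output for regression
])

predictions = regression_head(cls_embedding)  # shape (batch, 1)
```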

class Custom_GPT(tf.keras.Model):

  def __init__(self, embedding_dim):
      # (layer definitions omitted)
      ...

  def call(self, input_ids):
      # (forward pass omitted)
      ...
The model doesn’t seem to be learning anything here; it generates a random constant repeatedly.

Are you returning x at the end of call? I am not familiar with TensorFlow, but I assume that you still have to return the final logits; otherwise the method will implicitly return None.
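To illustrate the point: here is a minimal sketch of a tf.keras.Model subclass with an explicit return at the end of call (the layer names and sizes are placeholders, not the original code):

```python
import tensorflow as tf

class CustomRegressor(tf.keras.Model):
    """Minimal regression head; layer names/sizes are illustrative only."""

    def __init__(self, embedding_dim):
        super().__init__()
        # Dense layers infer their input size on first call,
        # so embedding_dim is kept only to mirror the original signature
        self.hidden = tf.keras.layers.Dense(64, activation="relu")
        self.out = tf.keras.layers.Dense(1)

    def call(self, inputs):
        x = self.hidden(inputs)
        x = self.out(x)
        return x  # without this line, call() implicitly returns None

model = CustomRegressor(embedding_dim=768)
y = model(tf.random.normal((4, 768)))  # shape (4, 1)
```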