GPT2 Training History?

Solamino · April 15, 2022, 6:16am

Hello.

I’m trying to train a GPT2 model (actually GPT2LMHeadModel) using TensorFlow 2.

In this post, the author shows in great detail how to train GPT2 in a new language. By following his guide I was able to train a new non-GPT2 model from scratch. However, after training finished I couldn’t visualize the training results. Not because of an error, but because I don’t know how.

After defining the optimizer, the loss function, and the metric:

# defining our optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0)
# defining our loss function
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
# defining our metric which we want to observe
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
# compiling the model
model.compile(optimizer=optimizer, loss=[loss, *[None] * model.config.n_layer], metrics=[metric])

Then I start training:

num_epoch = 10
history = model.fit(dataset, epochs=num_epoch)

Now, my questions are:

1. It only uses a training dataset. How do I evaluate it? Don’t I need a validation dataset? If so, how do I feed the validation dataset to model.fit()?
2. How do I interpret the training history? I want to draw a graph of training loss and accuracy alongside validation loss and accuracy.
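
To make the second question concrete, here is my best guess so far at the plotting side. It is only a sketch and I have not verified it against the GPT2 model. As far as I understand, the object returned by model.fit() has a .history dict mapping each metric name to a list of per-epoch values, and validation metrics show up under the same names prefixed with val_ when validation_data is passed. The helper names pair_metrics and plot_history, the output file name, and train_dataset / val_dataset are my own placeholders, and the actual key names may differ for a multi-output model like this one.

```python
def pair_metrics(history_dict):
    """Group history keys into {metric: (train_values, val_values_or_None)}."""
    paired = {}
    for name, values in history_dict.items():
        if name.startswith("val_"):
            continue  # validation series are attached to their training key below
        paired[name] = (values, history_dict.get("val_" + name))
    return paired


def plot_history(history_dict, path="training_history.png"):
    """Draw one subplot per metric (training vs. validation) and save it to `path`."""
    # Imported here so pair_metrics stays usable without matplotlib installed.
    import matplotlib.pyplot as plt

    paired = pair_metrics(history_dict)
    fig, axes = plt.subplots(1, len(paired), figsize=(6 * len(paired), 4), squeeze=False)
    for ax, (name, (train, val)) in zip(axes[0], paired.items()):
        epochs = range(1, len(train) + 1)
        ax.plot(epochs, train, label="train " + name)
        if val is not None:
            ax.plot(epochs, val, label="val " + name)
        ax.set_xlabel("epoch")
        ax.set_ylabel(name)
        ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Intended usage (assumption: train_dataset and val_dataset are two
# tf.data.Dataset objects built the same way as `dataset` above):
#   history = model.fit(train_dataset, validation_data=val_dataset, epochs=num_epoch)
#   plot_history(history.history)
```

If this is roughly right, what I am still unsure about is how to produce the validation split for this model and whether the history keys will be plain loss / accuracy here.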

Thank you for your time.
