Hello.
I’m trying to train a GPT2 model (actually GPT2LMHeadModel) using TensorFlow 2.
In this post the author shows in great detail how to train GPT2 in a new language. By following his guide I was able to train a new non-GPT2 model from scratch. However, now that training is done, I can’t visualize the training results. Not because of an error, but because I don’t know how.
After defining the optimizer, the loss function, and the metric:
# defining our optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0)
# defining our loss function
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
# defining our metric which we want to observe
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
# compiling the model
model.compile(optimizer=optimizer, loss=[loss, *[None] * model.config.n_layer], metrics=[metric])
Then I start training:
num_epoch = 10
history = model.fit(dataset, epochs=num_epoch)
Now, my questions are:
- It only uses a training dataset. How do I evaluate it? Don’t I need a validation dataset? If so, how do I feed the validation dataset to model.fit()?
- How do I interpret the training history? I want to draw a graph of training loss and accuracy alongside validation loss and accuracy.
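To make both questions concrete, here is a rough sketch of what I am aiming for. It assumes `dataset` is an already batched `tf.data.Dataset` and that holding out the first few batches is an acceptable validation split. As far as I understand, `model.fit()` accepts a `validation_data` argument and then records `val_`-prefixed entries in `history.history`. The helper names (`split_batches`, `history_curves`, `plot_history`), the output file name, and the number of held-out batches are my own placeholders, not something from the guide.

```python
def split_batches(dataset, n_val_batches):
    """Hold out the first n_val_batches batches for validation.

    Assumes `dataset` is a batched tf.data.Dataset (take/skip are tf.data methods).
    Caveat: if the dataset was shuffled with reshuffling on every epoch, the two
    parts can overlap across epochs, so it is safer to split before shuffling.
    """
    val_ds = dataset.take(n_val_batches)
    train_ds = dataset.skip(n_val_batches)
    return train_ds, val_ds


def history_curves(history_dict):
    """Pair every training entry with its 'val_' counterpart, if one exists.

    The exact key names depend on the model's outputs and metrics, so nothing
    is hard-coded here.
    """
    curves = {}
    for name, values in history_dict.items():
        if name.startswith("val_"):
            continue
        curves[name] = {
            "train": list(values),
            "val": list(history_dict.get("val_" + name, [])),
        }
    return curves


def plot_history(history_dict, path="training_curves.png"):
    """Draw one panel per metric, training vs. validation, and save it to `path`."""
    # Imported here so the helpers above work without matplotlib installed.
    import matplotlib.pyplot as plt

    curves = history_curves(history_dict)
    fig, axes = plt.subplots(1, len(curves), figsize=(6 * len(curves), 4), squeeze=False)
    for ax, (name, series) in zip(axes[0], curves.items()):
        epochs = range(1, len(series["train"]) + 1)
        ax.plot(epochs, series["train"], label="train")
        if series["val"]:
            ax.plot(epochs, series["val"], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(name)
        ax.legend()
    fig.tight_layout()
    fig.savefig(path)
    return path


# Intended usage with the model and dataset from above
# (100 held-out batches is an arbitrary placeholder):
#   train_ds, val_ds = split_batches(dataset, n_val_batches=100)
#   history = model.fit(train_ds, validation_data=val_ds, epochs=num_epoch)
#   plot_history(history.history)
```

If this is roughly right, the plot should show whether the validation loss starts rising while the training loss keeps falling, which I take to be the sign of overfitting I should look for.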
Thank you for your time.