TFOpenAIGPTDoubleHeadsModel Loss Function

bflassing · August 6, 2020, 2:17am

Hello,

I recently read the article: https://medium.com/huggingface/how-to-build-a-state-of-the-art-conversational-ai-with-transfer-learning-2d818ac26313

As an exercise, I was trying to convert all of this to use TensorFlow instead of PyTorch. I seem to be missing something, and I am sure it is a gap in my knowledge. Everything seems pretty straightforward except for calculating the loss.

The article states: “The total loss will be the weighted sum of the language modeling loss and the next-sentence prediction loss.”

Now, the PyTorch version, OpenAIGPTDoubleHeadsModel, returns both loss values from its “call” function. But the TensorFlow version, TFOpenAIGPTDoubleHeadsModel, does not. Does anyone have the knowledge or experience to explain how to calculate the loss from the TFOpenAIGPTDoubleHeadsModel model during training? The TF model doesn’t even take the labels that the PyTorch version uses to calculate the loss.
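To make the question concrete, here is a rough sketch of the kind of loss I think is needed, written directly against the two sets of logits. Everything here is an assumption on my part: the function name `double_heads_loss`, the tensor shapes, the `-100` ignore value for masked label positions, and the default coefficients are all illustrative, and I am assuming the language modeling logits and the multiple-choice logits have already been pulled out of the model's outputs.

```python
import tensorflow as tf


def double_heads_loss(lm_logits, mc_logits, lm_labels, mc_labels,
                      lm_coef=1.0, mc_coef=1.0, ignore_index=-100):
    """Weighted sum of a language modeling loss and a multiple-choice loss.

    Assumed shapes (hypothetical, adjust to your data):
      lm_logits: (batch, num_choices, seq_len, vocab_size)
      mc_logits: (batch, num_choices)
      lm_labels: (batch, num_choices, seq_len), ignore_index marks masked positions
      mc_labels: (batch,)
    """
    lm_logits = tf.convert_to_tensor(lm_logits)
    lm_labels = tf.convert_to_tensor(lm_labels)

    # Shift so that the logits at position t are scored against token t + 1.
    shift_logits = lm_logits[..., :-1, :]
    shift_labels = lm_labels[..., 1:]

    # Masked positions get a dummy label of 0 and are zeroed out afterwards.
    mask = tf.not_equal(shift_labels, ignore_index)
    safe_labels = tf.where(mask, shift_labels, tf.zeros_like(shift_labels))
    token_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=safe_labels, logits=shift_logits)
    mask_f = tf.cast(mask, token_loss.dtype)

    # Average over unmasked tokens only (guarding against an all-masked batch).
    lm_loss = tf.reduce_sum(token_loss * mask_f) / tf.maximum(
        tf.reduce_sum(mask_f), 1.0)

    # Classification loss over the candidate replies.
    mc_loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=mc_labels, logits=mc_logits))

    total = lm_coef * lm_loss + mc_coef * mc_loss
    return total, lm_loss, mc_loss
```

My understanding is that this would be called inside a `tf.GradientTape` block in a custom training loop, with the gradients taken with respect to `total`, but I have not confirmed that this matches what the PyTorch model computes internally.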

Thank you for any input.
Brett
