According to the explanation here:

> all Transformers models output the logits, as the loss function for training will generally fuse the last activation function, such as SoftMax, with the actual loss function, such as cross-entropy
Can anyone please explain this? What does it mean to "fuse the last activation function, such as SoftMax, with the actual loss function, such as cross-entropy"?
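For context, here is a minimal PyTorch sketch of what I understand the quote to be describing (the tensor values are made up for illustration): `F.cross_entropy` accepts raw logits and applies log-softmax internally, so the model never needs a softmax layer at the end.

```python
import torch
import torch.nn.functional as F

# Raw model outputs (logits) for a batch of 2 examples, 3 classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 1.5, 0.3]])
targets = torch.tensor([0, 1])

# "Fused" form: cross_entropy takes logits directly and applies
# log-softmax internally in a numerically stable way.
fused = F.cross_entropy(logits, targets)

# Equivalent "unfused" form: apply softmax explicitly, take the log,
# then compute the negative log-likelihood loss.
unfused = F.nll_loss(torch.log(F.softmax(logits, dim=-1)), targets)

# Both forms give the same loss value.
assert torch.allclose(fused, unfused)
```

If this is the right picture, my question is why the fused form is preferred and why that means the models themselves return logits.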