Question regarding T5ForConditionalGeneration loss in the example

Hi, I’m just wondering what the loss returned by T5ForConditionalGeneration represents.

In the example in the docs, the sentence is “The cute dog walks in the park”, and “cute dog” and “the” are masked.

In this case, does the loss indicate how far T5’s prediction is from “cute dog” and “the”?
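
To make the question concrete, here is a minimal sketch of what I assume the loss to be: the average cross-entropy over the target tokens (the sentinel tokens plus the masked-out words). This is plain Python, not the library's code. The toy vocabulary, the logits, the trailing `</s>` token, and the helper name `mean_token_cross_entropy` are all made up for illustration, and skipping labels equal to -100 is my assumption about how padded positions are handled.

```python
import math

IGNORE_INDEX = -100  # assumed label value for positions that should not count


def mean_token_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Average negative log-likelihood of each non-ignored target token."""
    total, count = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == ignore_index:
            continue  # e.g. padding positions
        # log of the softmax normalizer, shifted by the max for stability
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_z - scores[label]  # -log p(label)
        count += 1
    return total / count


# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
vocab = ["<extra_id_0>", "cute", "dog", "<extra_id_1>", "the", "<extra_id_2>", "</s>"]

# Target for "The <extra_id_0> walks in <extra_id_1> park"
target = ["<extra_id_0>", "cute", "dog", "<extra_id_1>", "the", "<extra_id_2>", "</s>"]
labels = [vocab.index(tok) for tok in target]

# A model with no idea (uniform scores) vs. one that favors the right token.
uniform_logits = [[0.0] * len(vocab) for _ in labels]
confident_logits = [
    [5.0 if i == label else 0.0 for i in range(len(vocab))] for label in labels
]

uniform_loss = mean_token_cross_entropy(uniform_logits, labels)
confident_loss = mean_token_cross_entropy(confident_logits, labels)
print(uniform_loss, confident_loss)
```

If this picture is right, a lower loss would mean the model puts more probability on “cute dog” and “the” (and on the sentinel tokens around them) at the corresponding target positions.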

Thank you very much!