Regression outputs (list) for normal distribution output in regression problems

Hi,

I have a question regarding the form of the predictions using a PatchTST regression model with ‘nll’ as loss.
I create a regression model as follows:

model = PatchTSTForRegression(arch_config)

and train it successfully on some data.

If I use MSE as the loss, as here:

arch_config = PatchTSTConfig(
	...
	distribution_output='normal',
	loss='mse',
)

the prediction
predicted_values = model(x).regression_outputs

returns a tensor predicted_values of size ‘batch_size’ x ‘output_features’, which makes perfect sense. In my understanding, the distribution_output parameter is ignored in that case.
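
For reference, here is a minimal self-contained sketch of that setup; the config values below are placeholders, not my actual settings:

import torch
from transformers import PatchTSTConfig, PatchTSTForRegression

arch_config = PatchTSTConfig(
    num_input_channels=3,
    context_length=32,
    num_targets=2,
    distribution_output='normal',  # appears to be ignored when loss='mse'
    loss='mse',
)
model = PatchTSTForRegression(arch_config)

x = torch.randn(4, 32, 3)  # (batch_size, context_length, num_input_channels)
predicted_values = model(past_values=x).regression_outputs
print(predicted_values.shape)  # torch.Size([4, 2]) -> (batch_size, num_targets)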

However, if I change that to the ‘nll’ loss with a ‘normal’ distribution output:

arch_config = PatchTSTConfig(
	...
	distribution_output='normal',
	loss='nll',
)

the same prediction returns a list of two tensors, both of shape ‘batch_size’ x ‘output_features’.
I just do not understand what these two list entries correspond to.
Are they somehow a mean and a (diagonal) covariance matrix of a normal distribution?
Am I getting something completely wrong?
What are the two list entries exactly?
I have not found any helpful clues in the documentation or forum.

Thank you and best regards!

Since there are no answers yet, I would like to rephrase my question as a more general one:

Using a PatchTSTForRegression with
distribution_output='normal' and
loss='nll'
in the PatchTSTConfig, the
PatchTSTForRegressionOutput.regression_outputs
(e.g. when predicting values with a trained model) is a tuple of length two. Each entry of the tuple is a tensor of size (batch_size, num_targets).

How can we interpret this tuple / the values of the tuple?

Any help, reference or suggestion is appreciated!

Yes, they are the loc and scale parameters of a Normal distribution.
See transformers/time_series_utils.py (around line 179) for more information:
class NormalOutput(DistributionOutput):
    """
    Normal distribution output class.
    """

    args_dim: Dict[str, int] = {"loc": 1, "scale": 1}
    distribution_class: type = Normal

    @classmethod
    def domain_map(cls, loc: torch.Tensor, scale: torch.Tensor):
        scale = cls.squareplus(scale).clamp_min(torch.finfo(scale.dtype).eps)
        return loc.squeeze(-1), scale.squeeze(-1)
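
In practice, you can plug the two entries directly into torch.distributions.Normal to get a point prediction and an uncertainty estimate. A minimal sketch, assuming a trained model configured with loss='nll' and distribution_output='normal', and an input batch x of shape (batch_size, context_length, num_input_channels):

import torch

with torch.no_grad():
    # regression_outputs is (loc, scale), each of shape (batch_size, num_targets)
    loc, scale = model(past_values=x).regression_outputs

dist = torch.distributions.Normal(loc, scale)  # independent Normal per target
point_prediction = dist.mean     # equals loc
uncertainty = dist.stddev        # equals scale
samples = dist.sample((100,))    # shape (100, batch_size, num_targets)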