Hi,
I have a question about the form of the predictions returned by a PatchTST regression model trained with ‘nll’ as the loss.
I create a regression model as follows:
from transformers import PatchTSTConfig, PatchTSTForRegression
model = PatchTSTForRegression(arch_config)
and train it successfully on some data.
If I use MSE as the loss, as here:
arch_config = PatchTSTConfig(
...
distribution_output='normal',
loss='mse',
)
the prediction
predicted_values = model(x).regression_output
returns a tensor predicted_values of size ‘batch_size’ x ‘output_features’, which makes perfect sense. As I understand it, the distribution_output parameter is ignored in that case.
However, if I change that to an ‘nll’ loss with a ‘normal’ distribution:
arch_config = PatchTSTConfig(
...
distribution_output='normal',
loss='nll',
)
the same call, model(x).regression_output, returns a list of two tensors, each of shape ‘batch_size’ x ‘output_features’.
I just do not understand what these two entries in the list correspond to.
Are they somehow a mean and a (diagonal) covariance matrix of a normal distribution?
Am I getting something completely wrong?
What are the two list entries exactly?
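To make my guess concrete, here is a minimal sketch of how I would expect to use the two entries if they really were a mean and a per-feature standard deviation of independent normal distributions. This is only my assumption, not something I found in the documentation. The names first_entry, second_entry and gaussian_nll are made up for illustration and are not part of the transformers API, and the numbers are dummy values in plain Python lists rather than real model output.

```python
import math


def gaussian_nll(target, mean, scale):
    """Negative log-likelihood of one scalar `target` under N(mean, scale**2)."""
    return 0.5 * math.log(2 * math.pi * scale**2) + (target - mean) ** 2 / (2 * scale**2)


# Hypothetical stand-ins for the two list entries,
# each of shape batch_size x output_features (here 2 x 3).
first_entry = [[0.0, 1.0, -1.0], [0.5, 2.0, 0.0]]   # ASSUMED to be the means
second_entry = [[1.0, 0.5, 2.0], [1.0, 1.0, 0.1]]   # ASSUMED to be the standard deviations
targets = [[0.0, 1.2, 0.0], [0.4, 1.0, 0.05]]       # dummy regression targets

# With a diagonal covariance the features are independent, so the NLL of one
# sample is the sum of the per-feature NLLs.
nll_per_sample = [
    sum(gaussian_nll(t, m, s) for t, m, s in zip(t_row, m_row, s_row))
    for t_row, m_row, s_row in zip(targets, first_entry, second_entry)
]

# Average over the batch, as a typical 'nll' training loss would do.
mean_nll = sum(nll_per_sample) / len(nll_per_sample)
print(nll_per_sample, mean_nll)
```

If this reading is right, the second entry would have to be strictly positive everywhere, which would at least be a quick sanity check on the real output.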
I have not found any helpful clues in the documentation or forum.
Thank you and best regards!