Regression outputs (list) for normal distribution output in regression problems

Hi,

I have a question regarding the form of the predictions using a PatchTST regression model with ‘nll’ as loss.
I create a regression model as follows:

model = PatchTSTForRegression(arch_config)

and train it successfully on some data.

If I use MSE as the loss, as here:

arch_config = PatchTSTConfig(
	...
	distribution_output='normal',
	loss='mse',
)

the prediction
predicted_values = model(x).regression_outputs

returns a tensor predicted_values of size ‘batch_size’ x ‘output_features’, which makes perfect sense. In my understanding, the distribution_output parameter is ignored in that case.
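
For reference, here is a minimal self-contained sketch of that setup; the config values below are placeholders, not my actual settings:

import torch
from transformers import PatchTSTConfig, PatchTSTForRegression

arch_config = PatchTSTConfig(
    num_input_channels=3,
    context_length=32,
    num_targets=2,
    distribution_output='normal',  # appears to be ignored when loss='mse'
    loss='mse',
)
model = PatchTSTForRegression(arch_config)

x = torch.randn(4, 32, 3)  # (batch_size, context_length, num_input_channels)
predicted_values = model(past_values=x).regression_outputs
print(predicted_values.shape)  # torch.Size([4, 2]) -> (batch_size, num_targets)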

However, if I change that to the ‘nll’ loss with a ‘normal’ distribution output:

arch_config = PatchTSTConfig(
	...
	distribution_output='normal',
	loss='nll',
)

the same prediction returns a list of two tensors, both of shape ‘batch_size’ x ‘output_features’.
I just do not understand what these two list entries correspond to.
Are they somehow a mean and a (diagonal) covariance matrix of a normal distribution?
Am I getting something completely wrong?
What are the two list entries exactly?
I have not found any helpful clues in the documentation or forum.

Thank you and best regards!

Since there are no answers yet, I would like to rephrase my question as a more general one:

Using a PatchTSTForRegression with
distribution_output='normal' and
loss='nll'
in the PatchTSTConfig, the
PatchTSTForRegressionOutput.regression_outputs
(e.g. when predicting values with a trained model) is a tuple of length two. Each entry of the tuple is a tensor of size (batch_size, num_targets).

How can we interpret this tuple / the values of the tuple?

Any help, reference or suggestion is appreciated!

Yes, they are the loc and scale parameters of a Normal distribution.
See transformers/time_series_utils.py (around line 179) for more information:
class NormalOutput(DistributionOutput):
    """
    Normal distribution output class.
    """

    args_dim: Dict[str, int] = {"loc": 1, "scale": 1}
    distribution_class: type = Normal

    @classmethod
    def domain_map(cls, loc: torch.Tensor, scale: torch.Tensor):
        scale = cls.squareplus(scale).clamp_min(torch.finfo(scale.dtype).eps)
        return loc.squeeze(-1), scale.squeeze(-1)
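
In practice, you can plug the two entries directly into torch.distributions.Normal to get a point prediction and an uncertainty estimate. A minimal sketch, assuming a trained model configured with loss='nll' and distribution_output='normal', and an input batch x of shape (batch_size, context_length, num_input_channels):

import torch

with torch.no_grad():
    # regression_outputs is (loc, scale), each of shape (batch_size, num_targets)
    loc, scale = model(past_values=x).regression_outputs

dist = torch.distributions.Normal(loc, scale)  # independent Normal per target
point_prediction = dist.mean     # equals loc
uncertainty = dist.stddev        # equals scale
samples = dist.sample((100,))    # shape (100, batch_size, num_targets)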