From the code here: transformers/src/transformers/models/decision_transformer/modeling_decision_transformer.py at v4.35.2 · huggingface/transformers · GitHub
# reshape x so that the second dimension corresponds to the original
# returns (0), states (1), or actions (2); i.e. x[:,1,t] is the token for s_t
x = x.reshape(batch_size, seq_length, 3, self.hidden_size).permute(0, 2, 1, 3)
# get predictions
return_preds = self.predict_return(x[:, 2]) # predict next return given state and action
state_preds = self.predict_state(x[:, 2]) # predict next state given state and action
action_preds = self.predict_action(x[:, 1]) # predict next action given state
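
To make sure I follow what the reshape is doing, here is a minimal toy sketch of my understanding (my own example, not code from the library), assuming tokens are stacked per timestep in the order return, state, action:

import torch

# Toy dimensions, just for illustration (not the model's real sizes).
batch_size, seq_length, hidden_size = 1, 2, 4

# Transformer output over the interleaved sequence
# (R_1, s_1, a_1, R_2, s_2, a_2): shape (batch, 3 * seq_length, hidden).
x = torch.arange(batch_size * 3 * seq_length * hidden_size, dtype=torch.float32)
x = x.reshape(batch_size, 3 * seq_length, hidden_size)

# Group the outputs by token type, as in the snippet above:
# x[:, 0] = hidden states at return positions,
# x[:, 1] = hidden states at state positions,
# x[:, 2] = hidden states at action positions.
x = x.reshape(batch_size, seq_length, 3, hidden_size).permute(0, 2, 1, 3)
print(x.shape)  # torch.Size([1, 3, 2, 4])
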
I'm not sure I understand why self.predict_return(x[:, 2]) or self.predict_state(x[:, 2]) is described as predicting the return / next state given the state and action. According to the comment at the top, x[:, 2] holds only the action tokens, so am I missing something? And if this code is correct, what is x[:, 0] used for?