Decision Transformer for Discrete Actions

I’m studying the Decision Transformer by following Train your first Decision Transformer.
In the post, the example is “halfcheetah” (which has a continuous action space), and
the following model code is used.
I’m trying to apply this to a discrete action space, so
I added a logit layer for the discrete actions and changed the loss function as shown below
(red: removed, blue: added).
Is this the right approach?
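For reference, the general idea can be sketched like this — a minimal, self-contained example of adding a logit head over the transformer’s action hidden states and swapping MSE for cross-entropy. All names and shapes here (`predict_action_logits`, `hidden_size`, etc.) are hypothetical, not taken from the original code snippet:

```python
import torch
import torch.nn as nn

# Hypothetical shapes: batch, sequence length, transformer hidden size, number of discrete actions
batch_size, seq_len, hidden_size, num_actions = 2, 5, 16, 4

# The added logit layer: projects hidden states to one logit per discrete action
predict_action_logits = nn.Linear(hidden_size, num_actions)

# Stand-in for the transformer's per-timestep action hidden states
action_hidden = torch.randn(batch_size, seq_len, hidden_size)
action_logits = predict_action_logits(action_hidden)   # (B, T, num_actions)

# For discrete actions, the targets are integer action indices, not continuous vectors
action_targets = torch.randint(0, num_actions, (batch_size, seq_len))

# Attention mask marks the valid (non-padded) timesteps
attention_mask = torch.ones(batch_size, seq_len, dtype=torch.bool)

# Cross-entropy replaces the MSE loss used for continuous actions;
# flatten and keep only the unmasked timesteps
logits_flat = action_logits.reshape(-1, num_actions)[attention_mask.reshape(-1)]
targets_flat = action_targets.reshape(-1)[attention_mask.reshape(-1)]
loss = nn.functional.cross_entropy(logits_flat, targets_flat)
```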

Hey,

I’m also trying out something similar. Your implementation looks almost identical to mine, except that I did not use an additional linear layer and used the model outputs directly as the logits. Also, I encoded my actions as one-hot vectors, so I had to do some reshaping with the action targets, but otherwise I think this is pretty much spot on.
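To illustrate the one-hot variant: if the dataset stores actions as one-hot vectors, cross-entropy still wants class indices, so the targets need an argmax before the reshape. This is just a sketch under those assumptions, not the actual code from either implementation:

```python
import torch
import torch.nn as nn

# Hypothetical setup: actions stored as one-hot vectors of size num_actions
batch_size, seq_len, num_actions = 2, 5, 4

# One-hot action sequence, e.g. as it might come out of the data collator
indices = torch.randint(0, num_actions, (batch_size, seq_len))
actions_one_hot = nn.functional.one_hot(indices, num_actions).float()

# Model outputs used directly as logits (no extra linear layer)
action_logits = torch.randn(batch_size, seq_len, num_actions)

# Reshape: recover class indices from the one-hot targets, flatten both tensors
targets = actions_one_hot.argmax(dim=-1).reshape(-1)   # (B*T,)
logits = action_logits.reshape(-1, num_actions)        # (B*T, num_actions)
loss = nn.functional.cross_entropy(logits, targets)
```

Printing `actions_one_hot.shape`, `targets.shape`, and `logits.shape` once during training is a cheap way to catch reshaping mistakes early.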

I usually just print out the shapes and the intermediate variables at least once as a sanity check, to make sure everything looks right and nothing is broadcast incorrectly.

Hey!
And in original_forward, why did you define the action_targets? As I understand it, the logits are the action_preds, no?

Yeah, it does seem like action_targets is technically not needed in the original_forward() function, which is used in test mode, and I don’t see it being used in the code snippet either.


Thank you for your quick response! :smile: I’m using DT for my bachelor’s thesis and I’m a bit lost. Since you’ve been using it for longer, I hope you don’t mind if I ask you some questions.

I have a case very similar to the one you mentioned, with discrete actions and one-hot encoding. So, does this code look good to you?

I’m also a bit unsure which activation function best fits this type of problem, and about certain model configuration parameters. What’s your view on these?

Thanks in advance!!!


Thanks for sharing. It helps me a lot.