I am reading the Decision Transformer paper and experimenting with the Hugging Face implementation's API.
The paper describes three kinds of model inputs: states, actions, and returns-to-go. The Hugging Face implementation, however, also accepts a fourth input, the immediate reward (`rewards`), which is never used inside the forward function.
To check this, I passed random values for `rewards`, and the model's output did not change.
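Here is a minimal sketch of the experiment I ran. It builds a small randomly initialized `DecisionTransformerModel`, calls it twice with identical inputs except for the `rewards` tensor, and compares the outputs; the dimensions and config values below are arbitrary choices for illustration, not anything prescribed by the library.

```python
import torch
from transformers import DecisionTransformerConfig, DecisionTransformerModel

# Small model for testing; state_dim/act_dim/hidden_size are arbitrary.
config = DecisionTransformerConfig(state_dim=3, act_dim=2, hidden_size=64, max_ep_len=100)
model = DecisionTransformerModel(config)
model.eval()  # disable dropout so the two forward passes are deterministic

batch, seq = 1, 5
states = torch.randn(batch, seq, config.state_dim)
actions = torch.randn(batch, seq, config.act_dim)
returns_to_go = torch.randn(batch, seq, 1)
timesteps = torch.arange(seq).unsqueeze(0)        # (batch, seq), long
attention_mask = torch.ones(batch, seq)

with torch.no_grad():
    # Same inputs twice, differing only in the `rewards` argument.
    out_zero = model(states=states, actions=actions,
                     rewards=torch.zeros(batch, seq, 1),
                     returns_to_go=returns_to_go, timesteps=timesteps,
                     attention_mask=attention_mask)
    out_rand = model(states=states, actions=actions,
                     rewards=torch.randn(batch, seq, 1),
                     returns_to_go=returns_to_go, timesteps=timesteps,
                     attention_mask=attention_mask)

# If `rewards` is truly unused, the predictions are identical.
print(torch.equal(out_zero.action_preds, out_rand.action_preds))
```

In my runs the comparison printed `True` for every output head, which is consistent with `rewards` never entering the computation graph.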
Since `rewards` is unused in the forward pass, I believe the argument is redundant and potentially confusing, and I suggest removing it from the code.
I am new to reinforcement learning, so my analysis may be mistaken; I would appreciate any corrections or feedback.