Implementing the REINFORCE algorithm for an encoder-decoder model

Hi, did you find a way to do this?