Implementing the REINFORCE algorithm for encoder-decoder model

aseemarora · September 12, 2021, 8:48pm

I am trying to train an encoder-decoder model (EncoderDecoderModel) with the REINFORCE algorithm. For that, I need to decode with random sampling (do_sample=True) and use the logits of the sampled tokens. I can’t find a way to get them: in training mode, the decoding used is greedy, and random sampling seems to be available only through the generate method. With generate, I couldn’t find a way to get the logits, and it also doesn’t store the gradients.

Any workaround for it?
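
One possible workaround, as a minimal sketch rather than a confirmed solution: sample the tokens without gradients, then run a normal forward pass on the sampled sequence to get logits that do carry gradients, and weight the log-probabilities of the sampled tokens by the reward. The helper name reinforce_loss, the toy tensors, and the per-sequence reward are assumptions made for illustration. The commented lines showing how the inputs might come from EncoderDecoderModel are untested here.

```python
import torch
import torch.nn.functional as F


def reinforce_loss(logits, sampled_ids, rewards, mask=None):
    """Hypothetical helper: -mean(reward * log p(sampled sequence)).

    logits:      [batch, steps, vocab] from a forward pass WITH gradients
    sampled_ids: [batch, steps] sampled token ids (treated as constants)
    rewards:     [batch] one scalar reward per sampled sequence
    mask:        [batch, steps] 1 for real tokens, 0 for padding (optional)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # Pick out the log-probability of each token that was actually sampled.
    token_log_probs = log_probs.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    if mask is not None:
        token_log_probs = token_log_probs * mask
    seq_log_prob = token_log_probs.sum(dim=-1)
    # The reward is a constant weight; gradients flow only through log p.
    return -(rewards.detach() * seq_log_prob).mean()


# With EncoderDecoderModel, the inputs might be obtained like this (assumption):
#   with torch.no_grad():
#       sampled = model.generate(input_ids, do_sample=True)
#   out = model(input_ids=input_ids, decoder_input_ids=sampled[:, :-1])
#   logits, sampled_ids = out.logits, sampled[:, 1:]

# Toy stand-in for decoder logits: batch of 2, 3 steps, vocabulary of 4.
torch.manual_seed(0)
logits = torch.zeros(2, 3, 4, requires_grad=True)
sampled_ids = torch.distributions.Categorical(logits=logits.detach()).sample()
rewards = torch.tensor([1.0, 2.0])  # made-up rewards for the two samples

loss = reinforce_loss(logits, sampled_ids, rewards)
loss.backward()  # logits.grad now holds the policy-gradient estimate
```

In practice a baseline is often subtracted from the reward to reduce variance, and padded positions after the end-of-sequence token would need to be masked out through the mask argument.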

NeoFelix · March 14, 2022, 8:22pm

Hi, did you find a way to do this?
