I am working on an encoder-decoder model that uses a fine-tuned RoBERTa classifier as the encoder and GPT2 as the decoder. Before passing the encoder context to the decoder, I mix it with some context from a different domain; this mixing module is a simple NN. I now want to pass the transformed hidden states to the GPT2 decoder for decoding, and I will train only the decoder and the mixer, not the encoder. How can I pass these transformed hidden states to the GPT2 decoder instead of `input_ids` or `inputs_embeds`? The shape of my transformed hidden states is `(n_layers, batch_size, sequence_length, hidden_size)`, where I am currently using `batch_size=1`, and `sequence_length` is 1 because I took only the `[CLS]` token hidden states from the encoder. Any help will be appreciated.
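For context, here is a minimal sketch of the setup I mean. The dimensions, the `Mixer` architecture, and all names are placeholders I made up for illustration; the real encoder is RoBERTa and the real domain context comes from elsewhere:

```python
import torch
import torch.nn as nn

# Toy dimensions; my real setup uses hidden_size=768 with batch_size=1
n_layers, batch_size, seq_len, hidden_size = 13, 1, 1, 768

# Stand-ins for the encoder's per-layer [CLS] hidden states,
# shape (n_layers, batch_size, sequence_length, hidden_size),
# and for the other-domain context
encoder_cls_states = torch.randn(n_layers, batch_size, seq_len, hidden_size)
domain_context = torch.randn(batch_size, seq_len, hidden_size)

# Placeholder mixing module: a simple NN combining the two contexts
class Mixer(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.proj = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, enc_states, ctx):
        # take the last layer's [CLS] state, shape (batch, seq, hidden)
        return self.proj(torch.cat([enc_states[-1], ctx], dim=-1))

mixer = Mixer(hidden_size)
mixed = mixer(encoder_cls_states, domain_context)
print(mixed.shape)  # torch.Size([1, 1, 768])
```

It is this `mixed` tensor that I want the GPT2 decoder to condition on in place of `input_ids` or `inputs_embeds`.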