Conditional generation from hidden states only


I’m working on a relatively simple problem where I would like to:

  1. train an encoder-decoder (seq2seq) architecture to obtain a latent space on my own corpus of sequences
  2. sample from the latent space (bypassing the encoder entirely), feed that sample into the decoder, and get an output sequence

I’ve noticed from the docs that input_ids are always required for the decoder (i.e. when using .generate), which makes sense when you have an input prompt (or even a single token to start from). However, in my case, I don’t have an input prompt, only a hidden state.

Is this workflow easily possible within Transformers, or perhaps only with a specific architecture?
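
To make the question concrete, here is a minimal sketch of what I have in mind. It assumes that .generate on an encoder-decoder model accepts precomputed encoder_outputs in place of input_ids, and that the sampled latent already has the shape of an encoder output, i.e. (batch, sequence length, d_model). The tiny, randomly initialised T5 config and the random latent are placeholders for illustration only; in practice the model would be the trained one and the latent would come from whatever sampling scheme the latent space supports.

```python
import torch
from transformers import T5Config, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

torch.manual_seed(0)

# Tiny, randomly initialised model so the sketch runs without a download.
# Assumption: a real setup would load the trained seq2seq model instead.
config = T5Config(
    vocab_size=32,
    d_model=16,
    d_kv=4,
    d_ff=32,
    num_layers=1,
    num_heads=2,
    pad_token_id=0,
    eos_token_id=1,
    decoder_start_token_id=0,
)
model = T5ForConditionalGeneration(config).eval()

# Stand-in for a sample from the latent space (hypothetical values).
# Shape must match an encoder output: (batch, seq_len, d_model).
latent = torch.randn(2, 5, config.d_model)

# Wrap the latent so the decoder cross-attends to it; the encoder is never run.
encoder_outputs = BaseModelOutput(last_hidden_state=latent)

with torch.no_grad():
    output_ids = model.generate(
        encoder_outputs=encoder_outputs,
        decoder_start_token_id=config.decoder_start_token_id,
        max_new_tokens=8,
        do_sample=False,  # greedy decoding; set True to sample tokens instead
    )

# One generated sequence per latent, each starting with the decoder start token.
print(output_ids.shape)
```

No input_ids are passed here: decoding starts from decoder_start_token_id alone. If the latent is a single vector per sequence rather than one vector per position, it would presumably need to be projected or reshaped to (batch, 1, d_model) first, which this sketch does not cover.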