How to implement a generate function for a separated encoder/decoder T5 model?

I was working on optimizing the T5 model. The transformers version I am using is 4.8. For optimization, I separated the model into an encoder and a decoder with the LM (language modeling) head. Previously, for generation, I just passed input_ids, attention_mask, max_length, and num_beams to generate. But now that the model is split into an encoder and a decoder, I can't call the generate function directly. I am, however, able to produce logits using the code below:

# run the TRT encoder once to get the encoder hidden states
encoder_last_hidden_state = t5_trt_encoder(input_ids=input_ids)
# the TRT decoder returns logits over the vocabulary
outputs = t5_trt_decoder(input_ids, encoder_last_hidden_state)
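
For comparison, before splitting the model, generation was a single call along these lines (model being the stock T5ForConditionalGeneration; the exact max_length and num_beams values here are placeholders, not the ones from my original code):

# placeholder values for max_length / num_beams
outputs = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    max_length=64,
    num_beams=4,
)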

But how can I use these logits to generate sequences? In other words, I am unclear on how to use the encoder and decoder to write a class that supports the generate function.
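
My current understanding is that greedy decoding would reduce to a manual loop roughly like the one below. This is only a sketch: it assumes the TRT decoder returns raw logits of shape (batch, seq_len, vocab), which I have not confirmed.

import torch

def manual_greedy_decode(input_ids, max_length=64):
    # encode once; the hidden states are reused at every decoding step
    encoder_last_hidden_state = t5_trt_encoder(input_ids=input_ids)

    # T5 starts decoding from the pad token
    decoder_input_ids = torch.full((1, 1), tokenizer.pad_token_id, dtype=torch.long)

    for _ in range(max_length):
        # assumption: decoder returns logits of shape (batch, seq_len, vocab)
        logits = t5_trt_decoder(decoder_input_ids, encoder_last_hidden_state)
        # greedily pick the most likely token at the last position
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        decoder_input_ids = torch.cat([decoder_input_ids, next_token], dim=-1)
        if next_token.item() == tokenizer.eos_token_id:
            break
    return decoder_input_ids

Is this the right mental model, or is there a cleaner way to hook into the generate machinery?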

When I run prediction this way (generating a sequence for an input):

import torch
from transformers.generation_stopping_criteria import (
    MaxLengthCriteria,
    StoppingCriteriaList,
)

max_length = 64

# T5 uses the pad token as the decoder start token
decoder_input_ids = torch.full(
    (1, 1), tokenizer.convert_tokens_to_ids(tokenizer.pad_token), dtype=torch.int32
)

encoder_last_hidden_state = t5_trt_encoder(input_ids=input_ids)

outputs = t5_trt_decoder.greedy_search(
    input_ids=decoder_input_ids,
    encoder_hidden_states=encoder_last_hidden_state,
    stopping_criteria=StoppingCriteriaList([MaxLengthCriteria(max_length)]),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The output is blank, which makes sense given the value of outputs[0]: it is all zeros, and token id 0 is T5's pad token, so skip_special_tokens=True strips everything:

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0])
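
Decoding without skipping special tokens confirms that the decoder is emitting nothing but pad tokens:

# decode without skip_special_tokens to inspect the raw output
print(tokenizer.decode(outputs[0]))
# prints a string consisting only of <pad> tokens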

The model I am trying to optimize is the valhalla/t5-small-qa-qg-hl model from the https://github.com/patil-suraj/question_generation repo, and I am using the TRT encoder and decoder (t5_trt_encoder and t5_trt_decoder) from this notebook:
https://github.com/NVIDIA/TensorRT/blob/main/demo/HuggingFace/notebooks/t5.ipynb
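
In case it helps clarify what I am after, here is the rough shape of the wrapper class I have in mind, so that the stock generate machinery could drive the TRT engines. This is only a sketch: EncoderWrapper and T5TRTForConditionalGeneration are names I made up, it disables key/value caching, and I have not verified that it runs end-to-end.

import torch
from transformers import PreTrainedModel, T5Config
from transformers.modeling_outputs import BaseModelOutput, Seq2SeqLMOutput


class EncoderWrapper(torch.nn.Module):
    # make the TRT encoder look like a HF encoder by returning a ModelOutput
    def __init__(self, trt_encoder):
        super().__init__()
        self.trt_encoder = trt_encoder

    def forward(self, input_ids, **kwargs):
        hidden_states = self.trt_encoder(input_ids=input_ids)
        return BaseModelOutput(last_hidden_state=hidden_states)


class T5TRTForConditionalGeneration(PreTrainedModel):
    # minimal PreTrainedModel subclass so GenerationMixin.generate can drive it
    # (name and structure are my own sketch, not from the notebook)
    def __init__(self, config, trt_encoder, trt_decoder):
        super().__init__(config)
        self.encoder = EncoderWrapper(trt_encoder)
        self.trt_decoder = trt_decoder

    def get_encoder(self):
        # generate() calls this once to encode the input
        return self.encoder

    def prepare_inputs_for_generation(self, input_ids, encoder_outputs=None, **kwargs):
        # generate() passes the tokens decoded so far as input_ids
        return {"decoder_input_ids": input_ids, "encoder_outputs": encoder_outputs}

    def forward(self, decoder_input_ids=None, encoder_outputs=None, **kwargs):
        logits = self.trt_decoder(decoder_input_ids, encoder_outputs.last_hidden_state)
        return Seq2SeqLMOutput(logits=logits)


config = T5Config.from_pretrained("valhalla/t5-small-qa-qg-hl")
config.use_cache = False  # no past key/value caching in the TRT decoder
model = T5TRTForConditionalGeneration(config, t5_trt_encoder, t5_trt_decoder)
outputs = model.generate(input_ids, max_length=64, num_beams=1)

Is something along these lines the intended way to do it, or is there an officially supported pattern for wrapping separated encoder/decoder engines?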