Best models for seq2seq tasks

Hi all, newbie here! From what I've read, transformers stand out for seq2seq tasks since they are much faster to train and much better at capturing complex relationships within sequences.
However, I have a particular use case where I want to train a model from scratch: the dataset consists of ciphertext/plaintext pairs, and the model has to decode the plaintext from the encrypted value.

The cipher is fairly advanced, so the relationship will obviously not be a simple one, and the length of the output sequence is variable. Can anyone point me to a model, preferably transformer-based, that is good at learning complex relationships between sequences? It does not have to be from the HF transformers library; any model you think is particularly good for this purpose will be considered. Please also explain why you recommend that model, so that I can research it further.
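
To make the question concrete, here is a rough sketch of the character-level encoding I have in mind (the special-token ids, example strings, and max lengths are placeholders, nothing final):

```python
# Rough sketch of character-level encoding for ciphertext -> plaintext
# pairs. PAD/BOS/EOS ids and the max lengths are arbitrary placeholders.
PAD, BOS, EOS = 0, 1, 2
OFFSET = 3  # reserve ids 0-2 for the special tokens

def encode(text: str, max_len: int) -> list[int]:
    """Map each character to an integer id, append EOS, pad to max_len."""
    ids = [ord(c) + OFFSET for c in text] + [EOS]
    return ids + [PAD] * (max_len - len(ids))

# One training pair: the encrypted value is the source, and the plaintext
# (wrapped in BOS/EOS) is the target. A variable-length output then just
# means the decoder learns to emit EOS earlier or later.
src = encode("some-encrypted-string", max_len=48)
tgt = [BOS] + encode("decoded", max_len=12)
```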

Cheers!
Neel Gupta

How long are the sequences? 10+ words, 100+ words, 1000+ words?

How much does length matter? What are the corresponding recommendations for each length range?

@chrisdoyleIE @mengyahu The input sequence should be about 40 characters max. BTW, how much does the length of a sequence matter in such a model? Also, the corresponding output would be no more than 7 characters, according to my dataset…
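
For what it's worth: with inputs of ~40 characters and outputs of at most 7, a small character-level encoder-decoder trained from scratch should be tractable. Here is a minimal sketch using the HF transformers library; every size and hyperparameter below is an arbitrary assumption, not a tuned recommendation:

```python
# Minimal from-scratch encoder-decoder sketch, reusing the placeholder
# special-token ids from the encoding snippet above. All sizes are
# arbitrary, untuned assumptions.
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

PAD, BOS = 0, 1   # placeholder special-token ids
VOCAB_SIZE = 259  # 256 possible character codes + 3 special tokens (assumption)

encoder_config = BertConfig(
    vocab_size=VOCAB_SIZE,
    hidden_size=256,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=512,
    max_position_embeddings=64,  # inputs are ~40 characters
)
decoder_config = BertConfig(
    vocab_size=VOCAB_SIZE,
    hidden_size=256,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=512,
    max_position_embeddings=16,  # outputs are at most 7 characters
    is_decoder=True,
    add_cross_attention=True,    # let the decoder attend over the encoder
)

config = EncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = EncoderDecoderModel(config=config)  # randomly initialised, no pretraining
model.config.decoder_start_token_id = BOS
model.config.pad_token_id = PAD
```

Staying at the character level matters here: a pretrained subword tokenizer would merge characters into units that hide exactly the per-character structure the model needs to learn, which is why a small model trained from scratch seems like a reasonable fit.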