Hello, Patrick. This is Reshinth. The original paper trained on 3 languages (Java, C++, and Python) and their monolingual corpora, so they would have 3 encoders and 3 decoders. To make this feasible and reduce the complexity, shall we reduce the entire problem to one-way translation between just two languages, say Python to C++? This would require us to have only 1 encoder and 1 decoder. Let me know what you think.
So, the end objective is to have a Seq2Seq model that can translate Python to C++ (py2cpp), trained with no parallel corpora.
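For context on the no-parallel-data part: the original paper gets its training signal from monolingual code alone, mainly via denoising auto-encoding plus back-translation. Purely as illustration, here is a minimal sketch of the token-dropping noise step for the denoising objective; `word_drop`, `drop_prob`, and the RNG handling are my own illustrative choices, not anything from the paper's code:

```python
import numpy as np

def word_drop(token_ids, drop_prob=0.1, rng=None):
    """Randomly drop tokens to create a noisy input sequence."""
    rng = rng or np.random.default_rng(0)
    keep = rng.random(len(token_ids)) >= drop_prob
    keep[0] = True  # always keep at least the first token
    return [t for t, k in zip(token_ids, keep) if k]

clean = [0, 42, 17, 99, 5, 2]            # e.g. a tokenized code snippet
noisy = word_drop(clean, drop_prob=0.3)  # corrupted model input
# training pair: (noisy -> clean); no parallel data needed
```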
And yes, we can use FlaxRoBERTa for both the encoder and the decoder.
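A minimal sketch of that 1-encoder/1-decoder setup, assuming transformers' `FlaxEncoderDecoderModel` wrapper is available; the `roberta-base` checkpoints here are placeholders for whatever code-pretrained weights we end up using:

```python
from transformers import FlaxEncoderDecoderModel, RobertaTokenizerFast

# Tie a RoBERTa encoder to a RoBERTa decoder; the wrapper adds
# cross-attention to the decoder automatically.
model = FlaxEncoderDecoderModel.from_encoder_decoder_pretrained(
    "roberta-base",  # encoder: reads Python source
    "roberta-base",  # decoder: generates C++
)
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

inputs = tokenizer("def add(a, b): return a + b", return_tensors="np")
outputs = model(
    input_ids=inputs.input_ids,
    decoder_input_ids=inputs.input_ids,  # dummy decoder inputs, just to test a forward pass
)
print(outputs.logits.shape)  # (batch, seq_len, vocab_size)
```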