Hello, Patrick. This is Reshinth. The original paper trained on 3 languages (Java, C++, and Python) and their monolingual corpora, so they would have 3 encoders and 3 decoders. To make this feasible and reduce the complexity, shall we reduce the entire problem to one-way translation between just two languages, say Python to C++? This would require us to have only 1 encoder and 1 decoder. Let me know what you think.
So, the end objective is to have a Seq2Seq model that can translate Python to C++ (py2cpp), trained with no parallel corpora.
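For context on the no-parallel-data part: the original paper gets its training signal from monolingual code alone, mainly via denoising auto-encoding plus back-translation. Purely as illustration, here is a minimal sketch of the token-dropping noise step for the denoising objective; `word_drop`, `drop_prob`, and the RNG handling are my own illustrative choices, not anything from the paper's code:

```python
import numpy as np

def word_drop(token_ids, drop_prob=0.1, rng=None):
    """Randomly drop tokens to create a noisy input sequence."""
    rng = rng or np.random.default_rng(0)
    keep = rng.random(len(token_ids)) >= drop_prob
    keep[0] = True  # always keep at least the first token
    return [t for t, k in zip(token_ids, keep) if k]

clean = [0, 42, 17, 99, 5, 2]            # e.g. a tokenized code snippet
noisy = word_drop(clean, drop_prob=0.3)  # corrupted model input
# training pair: (noisy -> clean); no parallel data needed
```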
And yes, we can use FlaxRoBERTa for both the encoder and the decoder.
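A minimal sketch of that 1-encoder/1-decoder setup, assuming transformers' `FlaxEncoderDecoderModel` wrapper is available; the `roberta-base` checkpoints here are placeholders for whatever code-pretrained weights we end up using:

```python
from transformers import FlaxEncoderDecoderModel, RobertaTokenizerFast

# Tie a RoBERTa encoder to a RoBERTa decoder; the wrapper adds
# cross-attention to the decoder automatically.
model = FlaxEncoderDecoderModel.from_encoder_decoder_pretrained(
    "roberta-base",  # encoder: reads Python source
    "roberta-base",  # decoder: generates C++
)
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

inputs = tokenizer("def add(a, b): return a + b", return_tensors="np")
outputs = model(
    input_ids=inputs.input_ids,
    decoder_input_ids=inputs.input_ids,  # dummy decoder inputs, just to test a forward pass
)
print(outputs.logits.shape)  # (batch, seq_len, vocab_size)
```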