Hello
I’m developing a machine translation (seq2seq) model that uses different source and target tokenizers. (As a beginner, I’m not even sure whether this is the right way to approach machine translation.)
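For context, the tokenizer setup looks roughly like the sketch below (the checkpoint names are just placeholders for illustration; the point is that the two vocabularies have different sizes):

```python
from transformers import AutoTokenizer

# Placeholder checkpoints: any two tokenizers with different vocabularies.
src_tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")         # source language
tgt_tokenizer = AutoTokenizer.from_pretrained("bert-base-german-cased")  # target language

# The vocab sizes differ, so a single shared embedding cannot serve both sides.
print(len(src_tokenizer), len(tgt_tokenizer))
```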
The problem appears when I try to split the shared (input/output) embedding into separate source and target embeddings (I’m using T5ForConditionalGeneration). After that, self.init_weights() is called, which in turn calls self.tie_weights().
As far as I know, self.tie_weights() makes the output embeddings identical to the input embeddings. In my case, however, that overwrites the target embedding and causes an indexing error.
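Here is a minimal sketch of what I mean (the class name, vocabulary sizes, and the way the new embeddings are created are only placeholders, not my exact code):

```python
import torch.nn as nn
from transformers import T5Config, T5ForConditionalGeneration


class SeparateEmbeddingT5(T5ForConditionalGeneration):
    """Sketch: replace the shared embedding with separate source/target embeddings."""

    def __init__(self, config: T5Config, src_vocab_size: int, tgt_vocab_size: int):
        super().__init__(config)

        # Source side: encoder input embedding.
        self.encoder.set_input_embeddings(nn.Embedding(src_vocab_size, config.d_model))
        # Target side: decoder input embedding and output projection.
        self.decoder.set_input_embeddings(nn.Embedding(tgt_vocab_size, config.d_model))
        self.lm_head = nn.Linear(config.d_model, tgt_vocab_size, bias=False)

        # Problem: self.init_weights() calls self.tie_weights(), which points
        # lm_head back at the model's input embedding (self.shared), so the
        # target-side output projection is overwritten and target token ids
        # beyond that vocab size later raise an indexing error.
        self.init_weights()
```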
My questions are:
- Should I call self.tie_weights() in this case?
- Is there a better approach I could try?
Here is the related post:
Thanks in advance for any answers!