Should we always use tie_weights? (I intend to separate source/target embeddings)


I’m developing a machine translation (seq2seq) model that uses different source and target tokenizers. (As a beginner, I’m not even sure this is the right way to approach machine translation.)

The problem arises when I try to split the shared (input/output) embedding into separate source and target embeddings (I’m using T5ForConditionalGeneration). During model construction, self.init_weights() is called, which in turn calls the self.tie_weights() method.

As far as I know, self.tie_weights() makes the output embeddings share the same weights as the input embeddings. In my case, however, the target embedding would then be overwritten by the source embedding, which causes an indexing error because the two vocabularies have different sizes.
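To make my setup concrete, here is a minimal sketch of what I’m attempting. It assumes the Hugging Face transformers library; the vocabulary sizes and model dimensions are placeholders, not values from a real run. Setting tie_word_embeddings=False in the config is how I understand tying can be disabled, so that init_weights() does not re-tie the output embedding:

```python
# Sketch of separate source/target embeddings for T5ForConditionalGeneration.
# Vocab sizes and model dimensions below are placeholders for illustration.
import torch.nn as nn
from transformers import T5Config, T5ForConditionalGeneration

SRC_VOCAB = 1000  # placeholder: source-tokenizer vocabulary size
TGT_VOCAB = 800   # placeholder: target-tokenizer vocabulary size

# tie_word_embeddings=False should stop init_weights()/tie_weights()
# from forcing the output embedding to share the input embedding.
config = T5Config(
    vocab_size=SRC_VOCAB,
    d_model=64, d_kv=16, d_ff=128, num_layers=2, num_heads=2,
    tie_word_embeddings=False,
)
model = T5ForConditionalGeneration(config)

# Give the encoder and decoder their own embedding tables.
model.get_encoder().set_input_embeddings(nn.Embedding(SRC_VOCAB, config.d_model))
model.get_decoder().set_input_embeddings(nn.Embedding(TGT_VOCAB, config.d_model))

# The LM head must match the *target* vocabulary, not the source one.
model.lm_head = nn.Linear(config.d_model, TGT_VOCAB, bias=False)
```

My worry is whether anything else in the library later calls tie_weights() again and silently re-ties these.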

My question is:

  1. Should I call self.tie_weights() in this case?
  2. Is there any better approach that I can attempt?

Here is a related post:

Thanks in advance for any answers!