Should we always use tie_weights? (intending to separate source/target embeddings)

jucho2725 · November 17, 2021, 3:35am

Hello

I’m developing a machine translation model (seq2seq) that uses different tokenizers for the source and target languages. (As a beginner, I don’t even know whether this is the right way to approach machine translation.)

The problem arises when I try to split the shared (input/output) embedding into separate source and target embeddings (I’m using T5ForConditionalGeneration). After that, the model calls self.init_weights(), which in turn calls the self.tie_weights() method.

As far as I know, self.tie_weights() makes the output embeddings share the same weights as the input embeddings. In my case, though, the source and target vocabularies differ, so the target embedding would be replaced and I would get an indexing error.
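To make the concern concrete, here is a minimal plain-PyTorch sketch of what I think weight tying does. This is not the actual T5 or Transformers code: the vocabulary sizes and the names `src_embed`, `tgt_embed` and `lm_head` are made up for illustration. It only shows that the output projection has to be shaped by the target vocabulary, so tying it to a source-side embedding of a different size cannot work, while tying it to the target embedding is shape-compatible.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
SRC_VOCAB, TGT_VOCAB, D_MODEL = 100, 60, 16

src_embed = nn.Embedding(SRC_VOCAB, D_MODEL)          # encoder input embedding
tgt_embed = nn.Embedding(TGT_VOCAB, D_MODEL)          # decoder input embedding
lm_head = nn.Linear(D_MODEL, TGT_VOCAB, bias=False)   # output projection over target vocab

# "Tying" means the output projection reuses an embedding matrix.
# Both weights below have shape (TGT_VOCAB, D_MODEL), so sharing is valid.
# src_embed.weight has shape (SRC_VOCAB, D_MODEL) and would not match the target vocab.
lm_head.weight = tgt_embed.weight

hidden = torch.randn(2, 5, D_MODEL)   # stand-in for decoder hidden states
logits = lm_head(hidden)              # one score per target-vocabulary token

print(logits.shape)
print(lm_head.weight is tgt_embed.weight)
```

If this picture is right, the question becomes whether the library's tying step can be pointed at the target embedding (or skipped) instead of the shared one.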

My questions are:

Should I call self.tie_weights() in this case?
Is there a better approach that I could try?

Here is the related post below:

Thanks in advance for any answers!
