My intuition would be that the transformer would still have a notion of context. It would still know that this word appears in context with those other words, but it would lose the notion of order that the position embeddings loosely encode. It would also still let a word's embedding change depending on the other words in context. So it would still be better than word2vec, which has only one embedding per word (learned as a blend of all the contexts it appears in). A small sketch of this intuition is below.
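
To make that concrete, here is a minimal sketch (NumPy, one attention head, random weights, all names hypothetical) of the idea that self-attention without position embeddings is permutation equivariant: each token's contextual embedding depends on *which* other tokens are in the sequence, but not on their order.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                  # embedding / head dimension
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

def self_attention(X):
    """Plain scaled dot-product self-attention, no position information."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                 # one contextual embedding per token

# Hypothetical word embeddings for a 4-token sentence
X = rng.normal(size=(4, d))
out = self_attention(X)

# Shuffle the token order: each token gets the same contextual embedding,
# just in shuffled positions, so context is kept but order is lost.
perm = np.array([2, 0, 3, 1])
out_perm = self_attention(X[perm])
print(np.allclose(out_perm, out[perm]))   # True
```

The contrast with word2vec is that `out` assigns each occurrence of a word a vector that depends on its neighbours in this particular sentence, whereas a word2vec lookup table would return the same single vector for a word no matter what sentence it appears in.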