Hi HF community members,
I am curious what you think about copying mechanisms for transformers.
I have seen very few papers or tech reports that implement a copying mechanism for a transformer.
I also couldn’t find anyone discussing copying mechanisms in this forum.
Personally, I am stuck on computing the ‘generating-copying switch’, since a transformer has no explicit ‘context vector’ the way an RNN encoder-decoder does.
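
For concreteness, here is a minimal framework-free sketch of one possible workaround, in the style of a pointer-generator: treat the attention-weighted sum of encoder outputs as a stand-in for the RNN context vector, feed it and the decoder state through a sigmoid to get the switch, and mix the vocabulary distribution with the attention weights over source tokens. All names here (`context_vector`, `copy_switch`, `final_distribution`, `w_c`, `w_s`) are hypothetical, the weights are toy numbers rather than learned parameters, and the choice of which cross-attention weights to use (a single head, an average over heads, a given layer) is an assumption left open.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def context_vector(attn, enc_states):
    """Attention-weighted sum of encoder states (stand-in for the RNN context vector)."""
    dim = len(enc_states[0])
    return [sum(a * h[d] for a, h in zip(attn, enc_states)) for d in range(dim)]

def copy_switch(context, dec_state, w_c, w_s, bias):
    """p_gen in (0, 1): probability of generating from the vocabulary instead of copying."""
    return sigmoid(dot(w_c, context) + dot(w_s, dec_state) + bias)

def final_distribution(p_gen, vocab_probs, attn, src_ids):
    """Mix the vocabulary distribution with the copy distribution over source tokens."""
    out = [p_gen * p for p in vocab_probs]
    for a, tok in zip(attn, src_ids):
        # Attention mass on a source position is added to that token's vocabulary slot.
        out[tok] += (1.0 - p_gen) * a
    return out

# Toy example: 3 source positions, hidden size 2, vocabulary of 4 tokens.
attn = [0.7, 0.2, 0.1]                       # assumed cross-attention weights (sum to 1)
enc_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dec_state = [0.5, -0.5]                      # assumed decoder output at this step
src_ids = [2, 0, 2]                          # vocabulary ids of the source tokens
vocab_probs = [0.1, 0.2, 0.3, 0.4]           # softmax output of the LM head

context = context_vector(attn, enc_states)
p_gen = copy_switch(context, dec_state, w_c=[1.0, -1.0], w_s=[0.5, 0.5], bias=0.0)
final = final_distribution(p_gen, vocab_probs, attn, src_ids)
print(p_gen, final)
```

Because both the vocabulary distribution and the attention weights sum to 1, the mixture is still a valid probability distribution. This sketch assumes every source token has an id inside the fixed vocabulary; handling out-of-vocabulary source tokens would need an extended vocabulary on top of this.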
Do you have any thoughts on why there are so few references and discussions about copying mechanisms?
Would it be worth implementing a copying mechanism and contributing it to the HF community?