I'm trying to reproduce LayoutLMv2's pretraining performance, but I've run into a problem computing the TIA loss.
In the LayoutLMv2 paper, TIA (Text-Image Alignment) is explained like this:

> When MVLM and TIA are performed simultaneously, TIA losses of the tokens masked in MVLM are not taken into account. This prevents the model from learning the useless but straightforward correspondence from ‘[MASK]’ to ‘[Covered]’.
I understand this to mean: when setting the TIA label, if a token was masked with MVLM's [MASK], ignore that token so the model can't learn the shortcut from [MASK] to [Covered].
So suppose the input embedding is generated line by line like this:

#line1: ['a']['b']['mask']['d']
#line2: ['e']['f']['mask']

TIA covers each line's image region with 15% probability. Assume line2 gets covered; then the full input's TIA labels would be:

#tokens:     [          line1          ] [       line2      ]
#tokens:     ['a']['b']['mask']['d']     ['e']['f']['mask']
#TIA labels: [notCover][notCover][ignore][notCover] [Covered][Covered][ignore]
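My understanding of the labeling above, as a minimal sketch (the function name, the boolean inputs, and the -100 ignore value are my assumptions, not the paper's code):

```python
# Hypothetical helper: build per-token TIA labels, assuming MVLM masking
# has already been decided and each token knows whether its line's image
# region was covered.
IGNORE = -100  # label value excluded from the loss

def build_tia_labels(tokens, line_covered, mvlm_masked):
    """tokens: flat token list over all lines;
    line_covered[i]: True if token i's line image region was covered by TIA;
    mvlm_masked[i]: True if token i was replaced by [MASK] in MVLM."""
    labels = []
    for covered, masked in zip(line_covered, mvlm_masked):
        if masked:
            labels.append(IGNORE)          # don't learn [MASK] -> [Covered]
        else:
            labels.append(1 if covered else 0)
    return labels

tokens = ['a', 'b', '[MASK]', 'd', 'e', 'f', '[MASK]']
line_covered = [False] * 4 + [True] * 3    # line2 is the covered line
mvlm_masked = [False, False, True, False, False, False, True]
print(build_tia_labels(tokens, line_covered, mvlm_masked))
# -> [0, 0, -100, 0, 1, 1, -100]
```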
But the TIA loss is computed as a binary cross-entropy loss, and BCE (e.g. `BCEWithLogitsLoss`) has no `ignore_index`. So how do they compute the loss while ignoring those tokens?
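For context, here is a sketch of the two equivalent workarounds I'm considering (this is my assumption about how it could be done, not the paper's actual code): treat TIA as 2-class classification with `cross_entropy(ignore_index=...)`, or mask the ignored positions manually before BCE.

```python
import torch
import torch.nn.functional as F

IGNORE = -100
# Per-token scores for [notCovered, Covered]; 7 tokens as in the example above.
torch.manual_seed(0)
logits = torch.randn(7, 2)
labels = torch.tensor([0, 0, IGNORE, 0, 1, 1, IGNORE])

# Option 1: 2-class cross-entropy skips IGNORE positions automatically.
loss_ce = F.cross_entropy(logits, labels, ignore_index=IGNORE)

# Option 2: binary cross-entropy on the logit difference, filtering the
# ignored positions by hand (sigmoid(z1 - z0) == softmax over 2 classes).
bin_logits = logits[:, 1] - logits[:, 0]
keep = labels != IGNORE
loss_bce = F.binary_cross_entropy_with_logits(bin_logits[keep],
                                              labels[keep].float())
# The two losses agree numerically.
```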