Why is LayoutLMv2 Bad at Token Classification?

fffiend · June 17, 2023, 8:19am

Since this is a pretrained model trained on invoice data, I expected it to work pretty smoothly. I gave it the following 9 labels and asked it to classify the tokens in the invoice document below:

labels = ['Contact Info','Address','Time','Date','Cost','Title','Table','Logo','Signature']
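
For reference, here is a minimal sketch of how a label list like this is usually turned into the id mappings a token-classification head works with. The names `id2label` and `label2id` follow the Hugging Face config convention; the commented-out model call and the `checkpoint` name are assumptions, since the loading code and checkpoint are not shown in this post.

```python
# Labels exactly as given in the post; everything else is illustrative.
labels = ['Contact Info', 'Address', 'Time', 'Date', 'Cost',
          'Title', 'Table', 'Logo', 'Signature']

# Integer id <-> label string mappings, in list order.
id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}

# Hypothetical usage (assumes the transformers library; `checkpoint` is a
# placeholder, not a name taken from the post):
#   model = LayoutLMv2ForTokenClassification.from_pretrained(
#       checkpoint, num_labels=len(labels),
#       id2label=id2label, label2id=label2id)

print(len(labels), label2id['Cost'])
```

One thing worth checking under this setup: if the checkpoint was not fine-tuned with exactly these 9 labels, the classification layer is typically newly initialized, so its predictions would be close to random until the model is fine-tuned on labeled examples.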

And this is the result (the picture is available on Google Images if you search “invoice pic”):

This is a screenshot of the full document, and I haven’t truncated anything. I thought this was supposed to give great results right out of the box, but apparently not? It’s grabbing all the text, but it’s heavily misclassifying it.

Any tips on what to do?

P.S. Can LayoutLMv2 also be used to extract the layout of GUI software screenshots? Or is there already another model that does that?

Thanks.
