Fine-tuning LayoutLMv2 for token classification on the CORD dataset

I used this colab:

to fine-tune LayoutLMv2ForTokenClassification on the CORD dataset.

Here is the result:

  • F1: 0.9665

The results are indeed pretty impressive when running on the test set.
However, when running on any other receipt (printed or PDF), the results are completely off.
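For reference, this is roughly how I run the fine-tuned model on an out-of-dataset receipt (a minimal sketch; the checkpoint name and image path are placeholders, and I rely on the processor's built-in OCR):

```python
from PIL import Image
import torch
from transformers import LayoutLMv2Processor, LayoutLMv2ForTokenClassification

# placeholder path for my CORD-fine-tuned checkpoint
processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
model = LayoutLMv2ForTokenClassification.from_pretrained("my-cord-checkpoint")
model.eval()

# an arbitrary receipt image that is NOT from CORD
image = Image.open("receipt.jpg").convert("RGB")

# the processor runs its built-in Tesseract OCR to get words + bounding boxes
encoding = processor(image, return_tensors="pt", truncation=True)

with torch.no_grad():
    outputs = model(**encoding)

# map the highest-scoring class index of each token to its label name
predictions = outputs.logits.argmax(-1).squeeze().tolist()
labels = [model.config.id2label[p] for p in predictions]
print(labels)
```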

So for some reason the model seems to be overfitting to the CORD dataset, even though I test on similar images.
I don't think there is data leakage, unless the CORD dataset itself is not clean (which I assume it is).

What could be the reason for this?
Is it some inherent property of LayoutLM?
The LayoutLM models are somewhat old, and they seem to be no longer maintained…

I don't have much experience, so I would appreciate any info.
Thanks