I’m trying to extract medical information from PDF files using LayoutLMv3 for token classification.
I’ve successfully fine-tuned the model on a handful of label types (name, date of birth, patient ID, etc.), but now I want to scale up to around 80 different labels.
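For reference, my current setup looks roughly like this (the label list is just a placeholder; the real one will have ~80 entries):

```python
from transformers import LayoutLMv3ForTokenClassification, LayoutLMv3Processor

# Placeholder label list -- the real set has around 80 entries.
labels = ["O", "B-NAME", "I-NAME", "B-DOB", "I-DOB", "B-PATIENT_ID", "I-PATIENT_ID"]
id2label = {i: label for i, label in enumerate(labels)}
label2id = {label: i for i, label in id2label.items()}

# apply_ocr=False because words and boxes come from my own PDF extraction.
processor = LayoutLMv3Processor.from_pretrained(
    "microsoft/layoutlmv3-base", apply_ocr=False
)
model = LayoutLMv3ForTokenClassification.from_pretrained(
    "microsoft/layoutlmv3-base",
    num_labels=len(labels),
    id2label=id2label,
    label2id=label2id,
)
```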
I’m wondering whether it’s better to train one model for all the labels or to split the task across multiple specialized models (say, around 10 labels each).
Has anyone encountered a similar situation or have advice on the best approach? Any experiences would be greatly appreciated. Thanks in advance!
whether it’s better to train one model for all the labels or to split the task across multiple specialized models (say, around 10 labels each)
Looking at the datasets used to train LayoutLMv2, label sets of fewer than about 20 classes seem to be the norm, and I’d expect v3 to behave similarly.
Small models often struggle when asked to predict many classes at once, so splitting the work across multiple specialized models is the safer route. And even if you keep training a single model, it’s worth saving your current working weights somewhere before scaling up.
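If you do split things up, here’s a minimal sketch of what multi-model inference could look like (the checkpoint paths, label groups, and merge rule are all made up for illustration):

```python
import torch
from transformers import LayoutLMv3ForTokenClassification, LayoutLMv3Processor

# Hypothetical checkpoint directories, one specialist per group of ~10 labels.
SPECIALIST_DIRS = ["ckpts/demographics", "ckpts/identifiers", "ckpts/clinical"]

processor = LayoutLMv3Processor.from_pretrained(
    "microsoft/layoutlmv3-base", apply_ocr=False
)

def predict_page(image, words, boxes):
    """Run every specialist on the same page; for each token keep the
    label predicted with the highest confidence across all models."""
    enc = processor(image, words, boxes=boxes, return_tensors="pt")
    best_scores, best_labels = None, None
    for ckpt in SPECIALIST_DIRS:
        model = LayoutLMv3ForTokenClassification.from_pretrained(ckpt)
        model.eval()
        with torch.no_grad():
            probs = model(**enc).logits[0].softmax(-1)  # (seq_len, n_labels)
        scores, ids = probs.max(-1)
        names = [model.config.id2label[i] for i in ids.tolist()]
        if best_scores is None:
            best_scores, best_labels = scores, names
        else:
            keep_new = scores > best_scores
            best_scores = torch.where(keep_new, scores, best_scores)
            best_labels = [n if k else o
                           for n, o, k in zip(names, best_labels, keep_new.tolist())]
    return best_labels

# Whichever route you take, checkpoint known-good weights before retraining:
# model.save_pretrained("ckpts/known-good-v1")
```

Note the highest-confidence merge here is deliberately naive: every specialist predicts “O” for tokens outside its own label group, so a real merging rule would need to treat “O” specially.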