Data augmentation FUNSD dataset & LayoutLMv3

Hi all,
So, I just created my first LayoutLMv3 model for token classification on the FUNSD dataset. Now I would like to fine-tune it on my own version of the FUNSD dataset. But since the number of documents is not large enough, data augmentation comes to mind.

I need some guidance on this topic. Since text extraction from documents is a big part of this problem, I don’t think any kind of transformation of the original image is valid for obtaining a new one (resizing, blurring, and changing background colors, to name a few, could negatively affect text extraction).

Is there any data augmentation technique that I could implement safely to get new valid data?

Greetings

@nielsr is your answer here applicable in this specific scenario?

Hi,

No, since the text to be extracted will change if you use things like random cropping or flipping.

For document AI, one typically applies augmentations like the ones here: https://github.com/facebookresearch/nougat/blob/f5d2cd525979e24c01c72fe223feff2eda555a0c/nougat/transforms.py. These are things like erosion, dilation, and bitmap transformations, which preserve the content of the images.
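
To make the erosion/dilation idea concrete, here is a minimal sketch using only NumPy. It is not the code from the linked file; the function names (`erode`, `dilate`, `augment`) and the 3x3 kernel size are assumptions chosen for illustration, and it assumes a grayscale page with dark text on a light background. The image size is unchanged, so existing bounding-box annotations keep the same coordinates.

```python
import numpy as np


def _neighborhood_stack(gray, k):
    """Stack every k x k shifted view of `gray` along axis 0 (edge-padded, k odd)."""
    pad = k // 2
    padded = np.pad(gray, pad, mode="edge")
    h, w = gray.shape
    return np.stack([
        padded[dy:dy + h, dx:dx + w]
        for dy in range(k)
        for dx in range(k)
    ])


def erode(gray, k=3):
    # Grayscale erosion: each pixel becomes the minimum of its k x k neighborhood.
    # With dark text on a light page, this makes the strokes thicker.
    return _neighborhood_stack(gray, k).min(axis=0)


def dilate(gray, k=3):
    # Grayscale dilation: each pixel becomes the maximum of its k x k neighborhood.
    # With dark text on a light page, this makes the strokes thinner.
    return _neighborhood_stack(gray, k).max(axis=0)


def augment(gray, rng, p=0.5):
    """With probability p, apply erosion or dilation (hypothetical helper)."""
    if rng.random() < p:
        op = erode if rng.random() < 0.5 else dilate
        return op(gray, k=3)
    return gray


# Tiny example: a white 5x5 "page" with a single dark pixel in the middle.
page = np.full((5, 5), 255, dtype=np.uint8)
page[2, 2] = 0

thick = erode(page)   # the dark pixel grows into a 3x3 dark block
thin = dilate(page)   # the isolated dark pixel disappears
```

In practice you would probably keep the kernel small and check a few augmented pages by eye (or re-run OCR on them), since strong dilation can erase thin strokes entirely.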
