Can LayoutLM be used for images?

eveningkid · October 24, 2020, 2:16pm

Hi,

I am very new to transformers and found out about it when looking for a LayoutLM implementation.

From my understanding, LayoutLM can be used to extract information from a document based on the layout it infers.

When browsing the documentation, I could only find examples using plain text, and I don’t know where to begin with passing in an image instead.

If you could help a newbie like me by showing how to pass it an image and how to interpret the results, you would really make me a happy man!

I really hope someone can help me.

Have a great day

rgwatwormhill · November 18, 2020, 3:17pm

Hi eveningkid,

Transformer models are designed for text.

It might be possible to force the model to accept a numeric representation of an image (after all, it’s all ones and noughts), but it would be unlikely to do anything useful.

hasansalimkanmaz · January 11, 2021, 5:43am

In particular, the image embeddings are not implemented or open-sourced. You can see this thread, but according to it, adding them would be difficult.
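
As a rough sketch of how a page image usually reaches the original LayoutLM in transformers: the model does not take pixels, but words plus one bounding box per word, typically produced by a separate OCR step, with box coordinates scaled to a 0-1000 range. The words, pixel boxes, and page size below are made-up sample values, and `normalize_box` is a hypothetical helper name, not part of the library.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in pixels, top-left and bottom-right corners
Box = Tuple[int, int, int, int]


def normalize_box(box: Box, width: int, height: int) -> List[int]:
    """Scale a pixel box to the 0-1000 coordinate range LayoutLM expects."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


# Assumption: an OCR engine has already produced these words and pixel boxes
# for a page that is 1000 pixels wide and 500 pixels tall.
words = ["Invoice", "Total:", "42.00"]
pixel_boxes = [(100, 50, 300, 100), (100, 400, 250, 450), (260, 400, 400, 450)]
page_width, page_height = 1000, 500

normalized = [normalize_box(b, page_width, page_height) for b in pixel_boxes]

for word, box in zip(words, normalized):
    print(word, box)
```

The words would then go through the tokenizer, and each resulting token would reuse the normalized box of the word it came from before being passed to the model as the `bbox` input.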
