Extracting metadata from images using LLMs

tyranheaton · June 17, 2025, 7:37pm

I would like to know which large language model, or combination of large language models, does the best job of accurately extracting names, dates, locations, and languages from images. Does Hugging Face provide any LLMs that accomplish this objective?

Mdrnfox · June 17, 2025, 7:41pm

Look into Donut, TrOCR, Mistral, and Llama.

I would possibly use a hybrid approach: run Donut on the image and feed its output into Llama.
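A minimal sketch of that hybrid idea, assuming the OCR step (Donut or TrOCR) and the LLM step (Llama or Mistral) are each wrapped as a plain Python callable. The names `build_prompt`, `parse_reply`, `extract_metadata`, and the `ocr` / `llm` parameters are hypothetical and not part of any library; only the glue between the two stages is shown, and the prompt wording is just one possible choice.

```python
import json
from typing import Callable

# The four fields asked about in the original question.
FIELDS = ("names", "dates", "locations", "languages")


def build_prompt(ocr_text: str) -> str:
    """Ask the LLM to return the four metadata fields as a JSON object."""
    return (
        "Extract metadata from the text below. Reply with one JSON object "
        f"whose keys are {', '.join(FIELDS)}, each mapping to a list of strings.\n\n"
        f"Text:\n{ocr_text}"
    )


def parse_reply(reply: str) -> dict:
    """Parse the span from the first '{' to the last '}' in the reply.

    Keys that are missing or not lists become empty lists, so callers
    always get the same shape back.
    """
    start, end = reply.find("{"), reply.rfind("}")
    data = {}
    if start != -1 and end > start:
        try:
            data = json.loads(reply[start:end + 1])
        except json.JSONDecodeError:
            data = {}
    result = {}
    for key in FIELDS:
        value = data.get(key)
        result[key] = [str(v) for v in value] if isinstance(value, list) else []
    return result


def extract_metadata(
    image_path: str,
    ocr: Callable[[str], str],   # hypothetical wrapper around Donut/TrOCR
    llm: Callable[[str], str],   # hypothetical wrapper around Llama/Mistral
) -> dict:
    """Stage 1: image -> text. Stage 2: text -> structured metadata."""
    text = ocr(image_path)
    return parse_reply(llm(build_prompt(text)))
```

Keeping the two stages behind simple callables means you can swap OCR models or LLMs and compare accuracy on your own images without touching the rest of the code.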

John6666 · June 18, 2025, 2:49am

If you’re looking for something related to OCR, this recently released model might also be a good choice.
