Extracting metadata from images using LLMs

I would like to know which large language model, or combination of models, most accurately extracts names, dates, locations, and languages from images. Does Hugging Face provide any LLMs that accomplish this objective?
