Hello everyone,
I’d like your opinion on the approach I’m using to extract metadata from PDF documents. My idea is to use one of the many Python libraries to extract the text from a PDF (falling back to OCR if the file isn’t text-based) and pass that text as the “context” to a large language model (LLM), which then answers a fixed set of queries (for example, the total amount of the invoice). Do you think this is a valid approach? Can you suggest better ones? I’d like to minimize the annotation effort, which is why I prefer this approach over LayoutLMv3 or Donut for feature extraction.
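To make the idea concrete, here is a rough sketch of the pipeline I have in mind. It is only an outline under a few assumptions: `extract_text`, `run_ocr` and `ask_llm` are hypothetical callables that you would back with whatever PDF library, OCR engine and LLM client you prefer, the field names in `QUERIES` are illustrative, and the `MIN_CHARS` threshold for deciding that a PDF has no usable text layer is an arbitrary guess.

```python
from typing import Callable, Dict

# Fixed ("static") questions asked of every document.
# The field names are illustrative, not a real schema.
QUERIES: Dict[str, str] = {
    "total_amount": "What is the total amount of the invoice?",
    "invoice_date": "What is the invoice date?",
}

# Assumed threshold: below this many characters the text layer
# is treated as missing and OCR is used instead.
MIN_CHARS = 20


def get_document_text(
    path: str,
    extract_text: Callable[[str], str],
    run_ocr: Callable[[str], str],
    min_chars: int = MIN_CHARS,
) -> str:
    """Try the embedded text layer first; fall back to OCR for scanned PDFs."""
    text = extract_text(path) or ""
    if len(text.strip()) < min_chars:
        text = run_ocr(path) or ""
    return text.strip()


def build_prompt(context: str, question: str) -> str:
    """Wrap the document text and one fixed question into a single prompt."""
    return (
        "Answer using only the document below. "
        "If the answer is not present, reply 'NOT FOUND'.\n\n"
        f"Document:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def extract_metadata(
    path: str,
    extract_text: Callable[[str], str],
    run_ocr: Callable[[str], str],
    ask_llm: Callable[[str], str],
    queries: Dict[str, str] = QUERIES,
) -> Dict[str, str]:
    """Run every fixed query against the document text and collect the answers."""
    context = get_document_text(path, extract_text, run_ocr)
    return {
        field: ask_llm(build_prompt(context, question)).strip()
        for field, question in queries.items()
    }
```

In practice `extract_text` could wrap something like pypdf, `run_ocr` something like Tesseract, and `ask_llm` a call to a hosted or local model. Keeping them as plain callables means each piece can be swapped or tested separately.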
Thank you!