How to match order documents with invoice documents and validate the content

Hi,

I would like to automate some of my administrative tasks.
As a total data/AI beginner, I was wondering whether there is a way to match orders (CSV, PDF and image formats) to invoices (PDF and images).
I saw that there are some transformer models (sentence similarity) that I can use to match items from the order and the invoice.

I managed to extract all the information from the documents with AWS Textract, then used all-MiniLM-L6-v2 to do sentence matching on the items. This way I can check whether each item in the order appears in the invoice. The process is far from perfect (compute time is a bit long and the business logic is a bit off), but it works.
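
A minimal sketch of that item-validation step, for illustration only. The `embed` function here is a toy bag-of-words stand-in for the all-MiniLM-L6-v2 embeddings (so it runs without any model download), and the function names, the sample items and the 0.5 threshold are all hypothetical:

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase token counts.
    In the real setup this would be the all-MiniLM-L6-v2 vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def validate_items(order_items, invoice_items, threshold=0.5):
    """Map each order item to its most similar invoice item,
    or to None when no invoice item reaches the threshold."""
    invoice_vecs = [(item, embed(item)) for item in invoice_items]
    result = {}
    for item in order_items:
        vec = embed(item)
        best, best_score = None, 0.0
        for candidate, cand_vec in invoice_vecs:
            score = cosine(vec, cand_vec)
            if score > best_score:
                best, best_score = candidate, score
        result[item] = best if best_score >= threshold else None
    return result


# Hypothetical sample data, standing in for the text extracted by Textract.
order = [
    "Blue ballpoint pen box of 50",
    "A4 printer paper 500 sheets",
    "Desk lamp LED",
]
invoice = [
    "Ballpoint pen, blue, box of 50",
    "Printer paper A4 (500 sheets)",
]

matches = validate_items(order, invoice)
missing = [item for item, match in matches.items() if match is None]
print(missing)  # ['Desk lamp LED']
```

Every order item is compared against every invoice item, which is part of why the compute time grows quickly as documents get longer.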

My issue is that, for this small experiment, I gave the model an order and its matching invoice. In the real case, I would have a bunch of orders and a bunch of invoices, and I would first need to pair them up (very time-consuming for a human, and it would take quite some compute time with my current approach).

How would you go about matching each order to its invoice and then validating the content of the invoice?

Thanks a lot