Can you recommend a model for extracting specific values from a large body of text? I am working on the SROIE dataset task 2: converting OCR-extracted text from receipts into the total amount spent and the company where the purchase was made. I can get ChatGPT-4 to extract the info quite nicely, but with t5-base, so far, all I get is:
'<pad> True</s>'
The data is bounding box coordinates plus the extracted text, one text box per line:
190,864,309,864,309,880,190,880,EXCHANGEABLE
142,883,353,883,353,901,142,901,***
137,903,351,903,351,920,137,920,***
202,942,292,942,292,959,202,959,THANK YOU
163,962,330,962,330,977,163,977,PLEASE COME AGAIN !
412,639,442,639,442,654,412,654,9.00
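For reference, here is a minimal sketch of how lines like these could be parsed and put into rough reading order. It assumes the first eight comma-separated fields are the x,y pairs of the four box corners and everything after the eighth comma is the text. The names OcrBox, parse_line and reading_order are hypothetical, made up for this illustration, not part of SROIE or any library.

```python
from typing import List, NamedTuple, Tuple


class OcrBox(NamedTuple):
    # Hypothetical container for one OCR line (not a SROIE-defined type).
    xs: Tuple[int, ...]  # x coordinates of the four corners
    ys: Tuple[int, ...]  # y coordinates of the four corners
    text: str            # OCR text for this box


def parse_line(line: str) -> OcrBox:
    # Assumption: 8 integer coordinates, then the text.
    # Split on the first 8 commas only, so commas inside the text are kept.
    parts = line.strip().split(",", 8)
    coords = [int(p) for p in parts[:8]]
    text = parts[8] if len(parts) > 8 else ""
    return OcrBox(tuple(coords[0::2]), tuple(coords[1::2]), text)


def reading_order(boxes: List[OcrBox]) -> List[OcrBox]:
    # Sort top-to-bottom, then left-to-right, by each box's top and left edge.
    return sorted(boxes, key=lambda b: (min(b.ys), min(b.xs)))


sample = """\
190,864,309,864,309,880,190,880,EXCHANGEABLE
142,883,353,883,353,901,142,901,***
137,903,351,903,351,920,137,920,***
202,942,292,942,292,959,202,959,THANK YOU
163,962,330,962,330,977,163,977,PLEASE COME AGAIN !
412,639,442,639,442,654,412,654,9.00
"""

boxes = [parse_line(l) for l in sample.splitlines() if l.strip()]
ordered_text = [b.text for b in reading_order(boxes)]

for box in reading_order(boxes):
    print(min(box.ys), min(box.xs), box.text)
```

Sorting this way moves the 9.00 box (y around 639) ahead of the footer lines (y from 864 onward), which is the kind of positional relationship I would like the model to pick up.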
The prompt I tried is:
input_text = f"Context: {the_data_above}\n\nQuestion: What is the total amount spent?\n\nAnswer:"
Is T5 a good model for this, or is there a better one? Are any special tricks needed to get it to use the bounding box coordinates to infer relationships between the text boxes?
Many thanks!