To get all my financial history into Firefly III, I have to extract the tables from my bank statements, which are available as PDFs, into CSV files.
I once started building an app with Camelot (PDF table extraction), similar to the approach in "A table detection, cell recognition and text extraction algorithm to convert tables in images to excel files" (Hucker Marius, Towards Data Science), but covering all cases is just a nightmare.
A few weeks ago I read "Structured Approach to Extract Information from Unstructured Documents" (Md Sharique, Analytics Vidhya, Medium, May 2022), and others on approaches like Mask R-CNN and YOLO.
The approach there sounds quite promising, but I would like to check whether it is reasonable to pursue this idea, since accuracy has to be close to 100%. (I can still cross-check the sum of all items against the given total and add or modify entries manually.)
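
The cross-check mentioned above could look roughly like this. It is only a minimal sketch: the function names `parse_amount` and `check_total` are made up for illustration, and it assumes the extracted amounts arrive as strings with `.` as the decimal separator and `,` as an optional thousands separator, which may not match a given bank's format.

```python
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Parse an amount string such as '1,234.56' or '-12.00'.

    Assumption: ',' is only ever a thousands separator and '.' is the
    decimal separator. Raises decimal.InvalidOperation on garbage input,
    which is itself a useful signal that extraction went wrong.
    """
    return Decimal(text.replace(",", "").strip())


def check_total(item_amounts, stated_total):
    """Compare the sum of extracted items with the total printed on the statement.

    Returns (ok, difference), where difference = sum(items) - stated total.
    Decimal is used instead of float so that cents add up exactly.
    """
    item_sum = sum((parse_amount(a) for a in item_amounts), Decimal("0"))
    difference = item_sum - parse_amount(stated_total)
    return difference == 0, difference


# Hypothetical rows, as they might come out of a table extraction step.
rows = ["-12.50", "1,000.00", "-87.49"]
ok, diff = check_total(rows, "900.01")
print(ok, diff)
```

A statement where `ok` is false would then be queued for manual review instead of being imported, so an extraction error shows up as a non-zero difference rather than silently ending up in the CSV.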
If it is reasonable, how can I keep the training confidential? I don't want to share my statements with a cloud service.
It would be very kind if someone could give me some advice.