I want to score a bunch of paper survey forms. Respondents are supposed to circle the choice they select. The choices are either T/F, or N/S/O/A for Never, Sometimes, Often, or Always. Note that respondents can also change their mind, cross out or scratch out an answer, and circle another.
I tried OCR, but of course it focused on the text, which I don’t care about. All I want to know is the question number and what was selected. I would like the result to turn into a file like this:
I’d be happy to create a small number of pages of training data, but it’s a PITA, so I want to keep it to a minimum. I’m not sure how to train a system to throw away all the question text and focus only on the question number and what is circled. I don’t have a GPU either, so I’m hoping to find something that requires minimal training.