Model for evaluation of scanned survey data

MH-BS · January 12, 2025, 2:55pm

Dear all

I guess that there may be a good model for the following purpose. I have around 140 scanned PDFs with paper questionnaires. People have checked certain boxes or values on a likert scale. There is only one text box. I will scan these papers, but I asked myself if there maybe a model here that can be used out of the box to evaluate those scans. As I only have these 140 questionnaires, there is not enough data for a model training, I guess. Is there something so sophisticated to be used already?

Best regards
Martin

John6666 · January 12, 2025, 3:43pm

I think it would be possible to achieve this by using a library for Python to convert the PDF to an image and then passing it to a VQA model with a certain level of performance, or by building an RAG that can read PDFs, like the ones available on Spaces.

MH-BS · January 13, 2025, 7:02am

I could also directly scan the pages into images instead of PDFs. And I will look up both solutions that you recommended.

Given that there are so many paper based surveys out there I am surprised that there are not already specialized models for that.

John6666 · January 13, 2025, 8:13am

No, there might be, but it’s more likely that I just don’t know about it.
Also, if you don’t mind using images instead of PDFs, it would be much easier. If you use a general-purpose VLM of a certain size, you can ask questions with text attached to the images, and they will answer. I think it’s possible with 8B or less. The larger the size, the higher the accuracy, but the operating environment becomes more demanding.

I can’t post a link, so try searching for “VL” in Spaces…

MH-BS · January 14, 2025, 10:03am

Thank you, I will look at the spaces. Until now I only looked at models.

MH-BS · January 19, 2025, 3:28pm

Hello everyone
A few days have passed and I have tried several of the models from the VL leaderboard that I found via spaces. However, I got none of them to work. Perhaps because I only have a laptop with a CPU. Do you know a good VQA model, as supposed here with 8B or so, that could also be run on a laptop for inference?
Best regards
Martin

John6666 · January 19, 2025, 4:36pm

You’ll need about 20GB of RAM to use the 8B model. Ideally, you’d want VRAM, but that’s not usually available on laptops.
You might be able to get by with 2B or 3B, but it’s pretty slow on a CPU.
You can use the Serverless Inference API for free up to 1000 times a day, so why not give it a try?

John6666 · January 19, 2025, 4:38pm

These are easy to use in small VLM. I think there are other good ones.

MH-BS · January 19, 2025, 4:39pm

Thanks I will try them. Recently I also wanted to do somethin with Llama, but read that it is not available in the EU anymore.

John6666 · January 19, 2025, 4:55pm

Actually, If you look for HF, you’ll find it…
But personally, I’d recommend Qwen or Moondream for smaller ones.

Topic		Replies	Views
Title: Recommendations for Models that Handle Text and Screenshots for QA Models	15	1113	November 7, 2024
Any Multi Modal LLMs that take direct pdf + text as input? 🤗Transformers	2	1926	October 10, 2024
Which model to select Models	1	70	April 14, 2025
Topic : Need a good model that run locally for pdf data extraction Models	0	74	November 24, 2024
Cost of Tax receipt recognition OCR vs. LLM Models	2	236	March 22, 2025

Model for evaluation of scanned survey data

Related topics