Medical analysis system

I have a social project that offers free screening eye tests. As of now we analyze the exams and write up the results manually, but I want to train a model to do it. I am a true beginner, so any guidance on where to start will be greatly appreciated. Here’s some info:

  • The exams are PDF files, with a mix of text and images.
  • The end result of this project will be a UI where we can upload a PDF and get the results.

Again, I am a true beginner, so I still have basic questions about how to get started: where to find models to train, where to host, how to design the frontend, basically everything! :)

Let’s assume that the final goal is to extract the contents of a PDF file, including images and text, and produce a text summary of them.
In this case, there are two main methods:

  • Converting the entire PDF to images and then analyzing it with AI
  • Decomposing the PDF into text and images and then analyzing each component with AI

If the former method is fine, you can choose one of the VLMs (vision-language models) below and then just create the part that converts the PDF to images. If the data is too complex or the format is inconsistent, you will need to use the latter method, and that has to be worked out case by case. I think it will be quite difficult…
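
As a rough starting point, the PDF handling for both methods could look something like the sketch below. It assumes the third-party PyMuPDF library (`pip install pymupdf`), which is just one possible choice and not something required by either method. The function names, the output file naming, and the 150 DPI default are all made up for illustration. The AI analysis itself is not shown: this only prepares the inputs you would hand to a model.

```python
from pathlib import Path


def page_image_path(pdf_path, page_index, out_dir):
    """Build the PNG path for one page (hypothetical naming scheme, 1-based)."""
    stem = Path(pdf_path).stem
    return Path(out_dir) / f"{stem}_page{page_index + 1:03d}.png"


def pdf_to_page_images(pdf_path, out_dir, dpi=150):
    """Method 1: render every page of the PDF to a PNG image."""
    import fitz  # PyMuPDF (assumed to be installed)

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    paths = []
    with fitz.open(pdf_path) as doc:
        for index, page in enumerate(doc):
            pix = page.get_pixmap(dpi=dpi)  # rasterize the whole page
            target = page_image_path(pdf_path, index, out_dir)
            pix.save(str(target))
            paths.append(target)
    return paths


def pdf_to_text_and_images(pdf_path):
    """Method 2: split each page into its plain text and embedded images."""
    import fitz  # PyMuPDF (assumed to be installed)

    pages = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            images = []
            for img in page.get_images(full=True):
                xref = img[0]  # first item is the image's cross-reference number
                info = doc.extract_image(xref)
                images.append((info["ext"], info["image"]))  # (file type, raw bytes)
            pages.append({"text": page.get_text(), "images": images})
    return pages
```

For example, `pdf_to_page_images("exam.pdf", "pages")` (with a hypothetical `exam.pdf`) would write one PNG per page into a `pages` folder, and those images could then be passed to a VLM. With the second function you get the text and the embedded images separately, and you still have to decide how to analyze each part.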