I think PyTesseract is the easiest option to use from Python. Since the layout isn't complicated, there seem to be many models that could handle this…
https://discuss.huggingface.co/search?q=ocr%20order%3Alatest
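For the PyTesseract route, something like the following might be a reasonable starting point. It is only a minimal sketch: it assumes the Tesseract binary plus the `pytesseract` and `Pillow` packages are installed, and the function names, the grayscale conversion, and the `--psm 6` setting (treat the image as a single uniform block of text) are my own guesses for a simple layout, not something tested on your images.

```python
from typing import List


def ocr_image(path: str, lang: str = "eng", psm: int = 6) -> str:
    """Run Tesseract on one image file and return the raw recognized text."""
    # Imported here so clean_lines() below can be used without the OCR stack.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        gray = img.convert("L")  # grayscale; an assumed, optional preprocessing step
        # --psm 6 asks Tesseract to assume a single uniform block of text.
        return pytesseract.image_to_string(gray, lang=lang, config=f"--psm {psm}")


def clean_lines(raw: str) -> List[str]:
    """Drop blank lines and collapse runs of whitespace in OCR output."""
    return [" ".join(line.split()) for line in raw.splitlines() if line.strip()]
```

You would call it as `clean_lines(ocr_image("scan.png"))`, where `scan.png` is just a placeholder file name. If the characters come out wrong, trying other `--psm` values or upscaling the image first is probably worth a shot before moving to a vision model.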
LLama3.2-vision: Works to some extent, but not reliable for precise character reading.
Llama 3.2 Vision isn’t bad, but Aya Vision and Qwen 2.5 VL are slightly better models, so it might be worth trying them out.