How to approach document parsing (images and tables)?

I'm building my own LLM and I'm very interested in document parsing. I want to extract specific information from various PDF and DOCX files. Plain text works fine, but extracting information from images and tables has been difficult. Is there a community thread that tracks the state of the art (SOTA), or a paper with code, that covers this topic?