Reading data from a PDF or image as part of the prompt

I have a use case I want to develop, but I don't know which workflow to follow with Hugging Face. First, I want to load some text (maybe several thousand words) that is in PDF format. Then, based on that text, I want to ask a model some questions (something like GPT, oasst-sft-4-pythia-12b-epoch-3.5, or another text-generation model). I understand that OpenAI has a feature that can extract the main meaning of a PDF. I don't want to fine-tune the pretrained model; I just want to send the questions and the PDF text together. I have two concerns:

1. How should I load the PDF?
2. If the text is fairly long, is it OK to put it directly into the prompt? What happens if it goes over the token limit?

Which model on Hugging Face should I use for this situation? Thank you very much.
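To make the two concerns concrete, here is a rough sketch of the workflow I have in mind, not a tested solution. It assumes the third-party `pypdf` package for reading the PDF. The helper names (`load_pdf_text`, `split_into_chunks`, `build_prompt`) and the 300-word limit are made up for illustration. Counting words is only a crude stand-in for the model's real token count, which would have to come from the model's own tokenizer.

```python
def load_pdf_text(path):
    """Extract plain text from a PDF file.

    Assumes the third-party `pypdf` package is installed; it is imported
    lazily so the other helpers work without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(path)
    # extract_text() can return nothing for image-only pages, hence the `or ""`.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def split_into_chunks(text, max_words=300):
    """Split text into pieces of at most `max_words` whitespace-separated words.

    Word count is only a rough proxy for tokens (an assumption of this sketch).
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_prompt(context, question):
    """Combine one chunk of document text with the question (hypothetical template)."""
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The idea would be to call `load_pdf_text("my_file.pdf")`, split the result with `split_into_chunks`, and send `build_prompt(chunk, question)` to the model one chunk at a time, so that no single prompt exceeds the limit. I am not sure whether this is the recommended approach.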