Need Suggestion

Hi all,

I’m new to the world of LLMs and need suggestions on the following idea. I want to fine-tune an LLM on past exam papers, which are currently available in PDF format.

Objective: The model should understand and explain past answers and be able to generate new questions and answers.

  1. Which LLM should I select for this purpose? Keep in mind that I want to start from the very basics since, as I mentioned, I’m a novice.

  2. The past papers are in PDF format and include pictures as well. Do I need to convert them to a specific format such as JSON?


You may look at the Hugging Face LLM leaderboard (search on Google for the exact link) and try to pick one of the latest models for your task.
For the PDFs, I am not sure, but I guess you could convert them to a text-based format, maybe CSV or JSON.
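
As a rough sketch of what such a conversion could look like in Python: this assumes the text has already been extracted from the PDF with some PDF tool, and that questions and answers are labelled "Q1." / "A1." and so on. Both the sample text and that layout are made up for illustration, so real papers will need a different parsing rule, and pictures are not handled here at all.

```python
import json
import re

# Hypothetical sample: text already extracted from one past paper.
# The "Q1." / "A1." layout is an assumption; real papers will differ.
SAMPLE_TEXT = """Q1. What is the capital of France?
A1. Paris.
Q2. Define osmosis.
A2. The movement of water across a semi-permeable membrane.
"""


def parse_qa_pairs(text):
    """Group lines like 'Q1. ...' and 'A1. ...' into one record per number."""
    pattern = re.compile(r"^([QA])(\d+)\.\s*(.+)$", re.MULTILINE)
    records = {}
    for kind, number, body in pattern.findall(text):
        entry = records.setdefault(int(number), {"id": int(number)})
        entry["question" if kind == "Q" else "answer"] = body.strip()
    return [records[n] for n in sorted(records)]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


records = parse_qa_pairs(SAMPLE_TEXT)
jsonl = to_jsonl(records)
print(jsonl)
```

One JSON object per line (JSON Lines) is a common layout for fine-tuning data, but the field names here ("id", "question", "answer") are only placeholders.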
For fine-tuning an LLM, please look at topics such as parameter-efficient fine-tuning (PEFT).
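
To give a feel for why PEFT helps a beginner with limited hardware, here is a small arithmetic sketch for LoRA, one popular PEFT method. LoRA keeps the original weight matrix frozen and trains two small low-rank matrices instead. The layer sizes and rank below are illustrative assumptions, not taken from any specific model.

```python
def full_params(d_out, d_in):
    """Number of weights in one dense layer of shape d_out x d_in."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Weights in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes only (assumed, not from a particular model).
d_out, d_in, rank = 4096, 4096, 8

full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
fraction = lora / full

print(f"full layer: {full:,} weights")
print(f"LoRA factors: {lora:,} trainable weights")
print(f"fraction trained: {fraction:.4%}")
```

With these assumed sizes, well under 1% of the layer's weights are trained, which is why such methods can fit on much smaller GPUs than full fine-tuning.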

I am not exactly sure, but you may also look at “vision LLMs”, which work on both images and text.
I hope this helps.
Good luck.

Thank you @PervaizKhan for your reply.