Looking for Direction on Best Task Types

Hi,

I'm new to fine-tuning and training, and I want to explore which types of tasks would best suit my use case. I have a basic understanding of LLMs and related techniques such as RAG and feature extraction.

I have had some initial success using base LLMs but want to see if there are better ways of meeting my goals.

I receive complex documents on various topics (complex policies, training records, etc., not simple invoices or receipts) and use human expertise to score these documents and ultimately provide a pass/fail. There are clear documented guidelines on what makes a pass or fail, but there is some subjectivity/professional opinion involved.

I’ve initially set up a pipeline of Content Extraction > Prompting > Scoring using various LLMs for some initial benchmarking.

I use some detailed prompts, but in summary: “Provide a score 0-100 for the document content below based on how well it matches the [CRITERIA] #Document Content [DOCUMENT CONTENT]”
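
To make the Prompting > Scoring step concrete, here is a minimal sketch of how it could look in Python. The helper names (`build_prompt`, `parse_score`, `to_verdict`), the "#Criteria" heading, and the pass threshold of 70 are assumptions for illustration only, and the LLM call itself is left out because it depends on the provider.

```python
import re

# Simplified version of the summary prompt above; the exact wording and the
# "#Criteria" heading are illustrative assumptions.
PROMPT_TEMPLATE = (
    "Provide a score 0-100 for the document content below based on how well "
    "it matches the criteria.\n\n"
    "#Criteria\n{criteria}\n\n"
    "#Document Content\n{document}"
)


def build_prompt(criteria: str, document: str) -> str:
    """Fill the template with the criteria and the extracted document text."""
    return PROMPT_TEMPLATE.format(criteria=criteria, document=document)


def parse_score(reply: str):
    """Return the first integer between 0 and 100 found in the reply, else None."""
    for match in re.finditer(r"\d+", reply):
        value = int(match.group())
        if 0 <= value <= 100:
            return value
    return None


def to_verdict(score, threshold: int = 70) -> str:
    """Map a score to pass/fail; the threshold of 70 is a hypothetical value."""
    if score is None:
        # No usable score in the reply, so hand the document to a human.
        return "needs review"
    return "pass" if score >= threshold else "fail"
```

In use, the text returned by the model would go through `parse_score` and then `to_verdict`, so that replies without a usable number are routed to a human reviewer rather than silently failed.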

What tasks/models might be best suited to scoring these complex documents against a set of criteria, and where might I start? I have a large dataset of good/bad documents for training if needed.

Thanks for any input.
