Hello,
Given a plain-text email thread, I am trying to extract the body of the most recent email.
I used to do that with rules. Now I am testing Large Language Models (LLMs) to see if they provide a less ad hoc solution.
Mistral-7B-Instruct, for instance, seems to understand the task and provides acceptable outputs most of the time.
However, in some cases, it explains the email rather than just copying the relevant chunk.
I have tried dozens of prompts, for instance:
instruction = 'Given the email thread below the dotted line, extract verbatim the body of the most recent (top) message. Remove all headers, footers and disclaimers. In your response, do not add any text that was not present in the original message'
I also tried to prevent hallucinations with the following generation settings:
generation_output = model.generate(
    model_inputs,
    do_sample=True,
    temperature=0.0000001,
    top_p=0.0000001,
    top_k=1,
    max_new_tokens=words
)
However, in a few cases, the model still adds explanations and/or hallucinates a bit.
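To make "adds explanations and/or hallucinates" concrete, here is a minimal sketch of a check that flags an output containing text that does not appear in the thread. The helper names `normalize` and `is_verbatim` and the sample thread are made up for illustration, and the check assumes that ignoring whitespace differences is acceptable.

```python
import re


def normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not cause false alarms."""
    return re.sub(r"\s+", " ", text).strip()


def is_verbatim(output: str, thread: str) -> bool:
    """Return True if every non-empty line of `output` occurs in `thread`."""
    source = normalize(thread)
    lines = [normalize(line) for line in output.splitlines()]
    return all(line in source for line in lines if line)


# Hypothetical thread and two hypothetical model outputs.
thread = (
    "Hi Bob,\nThe report is attached.\nBest,\nAlice\n\n"
    "On Monday, Bob wrote:\n> Please send the report."
)
good = "Hi Bob,\nThe report is attached."
bad = "The email says that the report is attached."

print(is_verbatim(good, thread))
print(is_verbatim(bad, thread))
```

This only detects added text. It does not tell whether the copied lines come from the most recent message rather than a quoted one.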
My questions are the following:
- Are you aware of any models that could do a better job without fine-tuning? For instance, purely extractive models (as opposed to generative ones).
- If generative models are the way to go, is there a way to force the model to just copy/paste?
Best,
Ed