What model will fit better for Email Parsing and Data Extraction

willthrom · January 24, 2024, 7:49am

Hi,

First, apologies if this is not the right section of the forum (I'm quite new here).

I am working on a new project, or rather a new idea, to automate a process that is currently very manual:

Lots of emails come in and are processed manually by a person, who extracts data from them and copies it into a spreadsheet (Excel or similar).

I can clearly identify the attributes/fields I want to extract from the text/emails.

I have also identified the types of text/emails that come in.

The “problem” is extracting that information accurately.

Email Example:

"Hello my friends.

we have two guys arriving tomorrow 23/01/2024 around 1pm from Madrid, flight ABC123
Another one people leaving on sunday same week around 2am to Instanbul, flight CBA321

Name of the arrving ones: Jose Mateo Feliz y Ana Triste del Carmen.
Name of the leaving Matia Nodoyuna

Regards"

Basically, the required output is a JSON object with a fixed, predefined list of attributes, each filled in (or left empty if not found).
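
To make the "fixed attribute list" idea concrete, here is a minimal Python sketch. The field names (`direction`, `date`, `time`, `city`, `flight`, `passengers`) are only assumptions for illustration, not a final schema. It takes whatever a model returns, drops unknown keys, and leaves missing attributes empty:

```python
import json
from dataclasses import dataclass, field, fields, asdict
from typing import List, Optional


@dataclass
class Movement:
    # Hypothetical attribute list, chosen only for illustration.
    direction: Optional[str] = None  # e.g. "arrival" or "departure"
    date: Optional[str] = None
    time: Optional[str] = None
    city: Optional[str] = None
    flight: Optional[str] = None
    passengers: List[str] = field(default_factory=list)


def normalize(raw: dict) -> dict:
    """Keep only the known attributes; missing ones stay None or []."""
    allowed = {f.name for f in fields(Movement)}
    kept = {k: v for k, v in raw.items() if k in allowed}
    return asdict(Movement(**kept))


if __name__ == "__main__":
    # Pretend this dict came back from a model; "mood" is not in the schema.
    model_output = {"direction": "arrival", "flight": "ABC123", "mood": "happy"}
    print(json.dumps(normalize(model_output), indent=2))
```

Normalizing like this means every record has the same keys regardless of what the model emits, which makes it easier to paste into a spreadsheet later.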

I have been playing with Mistral and Llama. They are OK, if a little slow (I tested different quantization levels and parameter counts), but I believe my scope is quite small, so with the right “small” model and some training (I have thousands of example emails) I could get a much faster and more accurate model…

Any thoughts?
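
For the training part, one possible way to turn the existing emails into fine-tuning data is sketched below. This assumes each email already has a hand-made JSON answer; the `prompt`/`completion` record layout and the instruction wording are assumptions too, since different fine-tuning tools expect different formats:

```python
import json

# Hypothetical instruction text; adjust to whatever the chosen model expects.
INSTRUCTION = "Extract the travel details from this email as JSON."


def to_training_record(email_text: str, labels: dict) -> str:
    """Serialize one (email, expected JSON) pair as a single JSONL line."""
    record = {
        "prompt": INSTRUCTION + "\n\n" + email_text,
        "completion": json.dumps(labels, ensure_ascii=False, sort_keys=True),
    }
    return json.dumps(record, ensure_ascii=False)


def write_jsonl(pairs, path: str) -> None:
    """Write an iterable of (email_text, labels) pairs, one record per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for email_text, labels in pairs:
            fh.write(to_training_record(email_text, labels) + "\n")
```

Sorting the keys in the completion keeps the target output in a consistent order across examples, which should make the task a bit easier for a small model to learn.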

Thanks in advance!

rebhaa · May 16, 2024, 4:18pm

Hi, I’m working on a similar use case. Curious to know if you have found a successful solution. I’ve tried a BERT model fine-tuned on SQuAD, but it was suboptimal. GPT-3.5 is the best so far, but I’m looking for an open-source option.
