Best RAG LLM for interaction with emails and files

Hello,

I am a beginner, and I currently use an Outlook email account. I am looking for the best strategy to interact with my emails (especially confidential ones, which is why I prefer using a private or local LLM). Additionally, I want to have a daily schedule with summaries and email responses to specific questions.

In my opinion, I have two possible approaches:

  1. Download emails and their attachments, convert them to text/PDF, and use a local LLM to interact with a folder containing these files. I could also schedule a Python script to process the data, answer specific questions, and send the responses via email.
  2. Find a model that can interact directly with Outlook emails to extract the information I need and provide answers to my questions while saving the results.

The problem is that I haven’t found a powerful model that meets my needs. I’ve tried RAG setups with 4-5 files, but I wasn’t satisfied with the answers. For example, I tested GPT4All, which produced lengthy and inaccurate summaries, even when combining multiple models. I also tried LLaMA 3, but its responses weren’t reliable.

What would you recommend as the best solution, and how should I implement it?

Thank you!

Hi, @Oussssss!
First, I appreciate that you are trying to solve a real-world problem with Outlook.
Before moving on to the main topic: I don’t know your baseline, so my opinion may differ from yours.
I think you should first find the appropriate API to connect to Outlook.
Here is a sample for you.

from msal import ConfidentialClientApplication

# Authentication details (from your Azure app registration)
client_id = 'YOUR_CLIENT_ID'
client_secret = 'YOUR_CLIENT_SECRET'
tenant_id = 'YOUR_TENANT_ID'

# Create a confidential client app
app = ConfidentialClientApplication(
    client_id,
    authority=f"https://login.microsoftonline.com/{tenant_id}",
    client_credential=client_secret,
)

# Get an app-only token (client credentials flow)
token_response = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])

# Access token; on failure MSAL returns "error" and "error_description" instead
if "access_token" not in token_response:
    raise RuntimeError(token_response.get("error_description", "Token request failed"))
access_token = token_response["access_token"]

You can then read and send messages:

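import requests

# An app-only token has no signed-in user, so the /me endpoints do not work
# with it; address the mailbox explicitly instead (placeholder value).
user_id = 'USER_ID_OR_PRINCIPAL_NAME'

# Microsoft Graph API endpoint for reading messages
url = f"https://graph.microsoft.com/v1.0/users/{user_id}/messages"

# Headers with Authorization token
headers = {
    'Authorization': f'Bearer {access_token}',
    'Content-Type': 'application/json'
}

response = requests.get(url, headers=headers)
response.raise_for_status()

# Print response (the emails are listed under the "value" key)
emails = response.json()
print(emails)

# Microsoft Graph API endpoint for sending mail
url = f"https://graph.microsoft.com/v1.0/users/{user_id}/sendMail"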

email_data = {
    "message": {
        "subject": "Test Subject",
        "body": {
            "contentType": "Text",
            "content": "This is a test email."
        },
        "toRecipients": [
            {
                "emailAddress": {
                    "address": "recipient@example.com"
                }
            }
        ]
    }
}

response = requests.post(url, headers=headers, json=email_data)

# Check if email was sent successfully
if response.status_code == 202:
    print("Email sent successfully.")
else:
    print(f"Failed to send email: {response.status_code} {response.text}")
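
To hand these messages to a local model, each one first has to become plain text. Here is a minimal sketch of that step, assuming each message dictionary has the shape Graph returns in the "value" list (`subject`, `from`, `receivedDateTime`, `body`). The helper name `message_to_text` is my own and not part of any library, and the HTML stripping is deliberately simple.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, skipping script/style."""

    def __init__(self):
        super().__init__()
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        return "\n".join(self._parts)


def message_to_text(msg):
    """Flatten one Graph message dict into plain text (hypothetical helper)."""
    body = msg.get("body") or {}
    content = body.get("content") or ""
    if (body.get("contentType") or "").lower() == "html":
        parser = _TextExtractor()
        parser.feed(content)
        parser.close()  # flush any buffered text
        content = parser.text()
    sender = ((msg.get("from") or {}).get("emailAddress") or {}).get("address", "unknown")
    return (
        f"Subject: {msg.get('subject') or ''}\n"
        f"From: {sender}\n"
        f"Received: {msg.get('receivedDateTime') or ''}\n\n"
        f"{content.strip()}\n"
    )
```

You could call it as `docs = [message_to_text(m) for m in emails.get("value", [])]` and write each string to a file in the folder your local LLM reads from.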

Next, choosing the model is the challenging part, but you can pick one as your baseline. As far as I know, the GPT models are the best in the field.
Hope this helps!

Thanks for your reply. I think my biggest issue is finding a model that can interact with my emails and PDFs.

Is there anything I can help with?
I have experience developing a chatbot, and I used RAG for it.
At that time I used a vector DB, and the response time was too long.
How will you solve this problem? Or are you going to fine-tune the model?
I think you can get a satisfactory result if you use a good prompt!
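
On the response-time point: for a handful of files, the retrieval step can stay entirely in memory. Here is a minimal sketch that uses plain TF-IDF scoring instead of embeddings and a vector DB (a simpler, swapped-in technique, not what anyone in this thread used). The names `chunk_text`, `TfidfIndex`, and `build_prompt` are hypothetical, and the prompt wording is only one guess at what a "good prompt" could look like.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, max_words=120):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class TfidfIndex:
    """Tiny in-memory TF-IDF index over text chunks."""

    def __init__(self, chunks):
        self.chunks = chunks
        term_counts = [Counter(tokenize(c)) for c in chunks]
        doc_freq = Counter()
        for counts in term_counts:
            doc_freq.update(counts.keys())
        n = len(chunks)
        # Smoothed inverse document frequency; always positive.
        self.idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
        self.vectors = [self._weigh(c) for c in term_counts]

    def _weigh(self, counts):
        """Build a unit-length TF-IDF vector, ignoring terms unknown to the index."""
        vec = {t: tf * self.idf[t] for t, tf in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def search(self, query, k=3):
        """Return up to k (score, chunk) pairs, best first, dropping zero scores."""
        q = self._weigh(Counter(tokenize(query)))
        scored = []
        for vec, chunk in zip(self.vectors, self.chunks):
            score = sum(w * vec.get(t, 0.0) for t, w in q.items())
            if score > 0:
                scored.append((score, chunk))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


def build_prompt(question, hits):
    """Assemble a grounded prompt from retrieved (score, chunk) pairs."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, (_, chunk) in enumerate(hits))
    return (
        "Answer using only the excerpts below. "
        "If the answer is not in them, say so. Keep the answer under 100 words.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

The string returned by `build_prompt` is what you would pass to whichever local model you choose. Keeping the context to a few relevant chunks and capping the answer length may also help with the long, inaccurate summaries mentioned earlier.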

Hello, why is my post hidden?

The automatic spam filter has become unusually strict because of the persistent trolls attacking the forum. :sob:
If there is nothing wrong with the post, the staff will restore it later.