Getting Started Use Case - Summarization via Email

I am a business user (not a developer) but really interested in how HF can help me and others like me with real-world business challenges. I have completed the course, so I have a decent idea of what challenges can be solved, and this is one that would benefit me.

I will describe it below. Before I start to cut some code, I am keen to understand whether this has already been done, or whether there are shortcuts or best practices for achieving what I want to do. Again, I am not a developer, but I have an OK understanding of Python and have written some scripts for myself in the past, so it would be good to try this out.

Challenge

I get a lot of news links from others in my team about products relating to my business area. These could be press releases, editorial pieces from trade publications, or blog posts from competitor sites. It takes time to read all of these, and a lot of the time the content is just not that useful, but I have to take the time to read it to figure that out.

Where HF could help

What I would like to do is forward these to a unique email address I would create.

This mailbox would pick up these emails and check for a simple code (to avoid spam requests). If the code is present, it would look for a link (URL) in the email. If there is one, it would call inference on the URL contents; if not, it would assume the text of the article is in the mail body and call inference on that.
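
The mailbox logic described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the access code `SUMMARIZE-1234`, the function names, and the `fetch` callable (which stands in for whatever downloads the page text) are all made up for illustration.

```python
import re
from email import message_from_string
from email.message import Message

ACCESS_CODE = "SUMMARIZE-1234"  # hypothetical shared code used to filter out spam
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def get_plain_body(msg: Message) -> str:
    """Return the first text/plain part of the message as a string."""
    parts = msg.walk() if msg.is_multipart() else [msg]
    for part in parts:
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


def choose_text_to_summarize(raw_email: str, fetch) -> str:
    """Decide what to summarize: the linked page if a URL is present, else the body.

    `fetch` is an assumed callable that takes a URL and returns the page text.
    """
    body = get_plain_body(message_from_string(raw_email))
    if ACCESS_CODE not in body:
        raise PermissionError("access code missing; ignoring message")
    body = body.replace(ACCESS_CODE, "")
    match = URL_PATTERN.search(body)
    if match:
        # Drop punctuation that often trails a URL at the end of a sentence.
        return fetch(match.group(0).rstrip(".,);"))
    return body.strip()
```

Whatever this returns would then be passed to the summarization model. Passing `fetch` in as an argument keeps the sketch independent of how pages are actually downloaded.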

It would then email back the result: hopefully a summary, or an error if there was a problem. If the summary looks interesting, I will likely read the article in full; otherwise not.
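
The reply step might look something like the sketch below. The `summarize` callable is a placeholder for whichever model or API ends up doing the work, and the addresses and subject prefix are invented for the example.

```python
from email.message import EmailMessage


def build_reply(sender: str, service_address: str, subject: str,
                text: str, summarize) -> EmailMessage:
    """Build the reply email: the summary on success, an error note on failure.

    `summarize` is an assumed callable that takes text and returns a summary string.
    """
    reply = EmailMessage()
    reply["From"] = service_address
    reply["To"] = sender
    reply["Subject"] = "Summary: " + subject
    try:
        body = summarize(text)
    except Exception as exc:  # report any failure back to the sender
        body = f"Sorry, the article could not be summarized: {exc}"
    reply.set_content(body)
    return reply
```

The resulting message could then be sent with the standard library's `smtplib.SMTP.send_message`, given an SMTP server to connect to.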

Where this could go

Ultimately I can see this being a feature of a mail client, something the user would opt into, with the results delivered automatically in the client itself. It would not surprise me if this appears in a future version of Outlook; it is just a question of when!

Many Thanks

Hi Ravi, thanks for opening this thread, that sounds like an interesting idea :slight_smile:

I will assume that you are looking for guidance on the summarisation part and not on the architecture of the end-to-end solution (e.g. web scraping, automatically parsing emails, queuing systems, etc.).

If so, then I will shamelessly plug a blog post I have written recently that gives a (hopefully) gentle introduction to setting up a summarisation project with HF: Setting up a Text Summarisation Project | by Heiko Hotz | Dec, 2021 | Towards Data Science

It’s a bit of a longer read, but it is geared towards readers that don’t necessarily have strong HF and/or machine learning expertise.

I hope this is useful, please let me know if you have any questions on that.

Cheers
Heiko

This is great. Thanks for sharing.

If I were starting a project like this today, I would probably just use a tool like Zapier to receive emails sent to an email-parsing alias, and then use the OpenAI APIs to perform the summarization. Zapier has a good email parser that extracts the text parts intelligently, and the text is then easy to pass on for URL extraction using another of their native functions. You'd then fetch the URLs using a service like https://www.browserless.io/, which uses headless browsers (for scraping purposes) and passes the DOM back so you can extract the raw text and ship it off to OpenAI or wherever.
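
For anyone who would rather script the "extract the raw text" step than use a no-code tool, here is a minimal Python sketch using only the standard library. It assumes the page HTML has already been fetched (by a headless browser service or otherwise); the class and function names are just illustrative, and real pages would probably need more cleanup than this.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping script and style content."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The string returned by `html_to_text` is what you would hand to the summarization API.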

In other words, I think these days you can build this with a no-code option. And by the way, I think this is a fantastic idea!