Darshan Hiranandani : Implementing Website Search Functionality for Custom Chatbots Using RAG

Hi everyone,

I’m Darshan Hiranandani. I’m new to Hugging Face and have been exploring its capabilities. I’m really excited about the possibilities here!

I’m wondering if it’s possible to replicate the “domain search” feature, similar to what’s available in HuggingChat, for my own custom chatbots. Essentially, I’d like to use a retrieval-augmented generation (RAG) approach.

Is there an easy way to crawl or connect to data from a website URL for this purpose? If so, I’d appreciate it if you could share any relevant tools, examples, or guidance on how to set this up.

Thanks so much for your help!
Regards
Darshan Hiranandani


Hi. The forum currently has strict posting restrictions, so I can’t post many links. Please search using the keywords mentioned here. 😅
In addition to the methods explained below, Hugging Face recently released a framework called “smolagents” that is proving popular. It’s probably closer to what you’re looking for.

Yes, you can replicate a “domain search” feature similar to HuggingChat’s with a retrieval-augmented generation (RAG) approach. To get data from a website URL, either scrape the pages or use the site’s API if it offers one. BeautifulSoup and Scrapy handle the scraping, and LangChain helps connect external data sources to a language model. The workflow is: extract the relevant text from the site, index it, and at query time retrieve the passages closest to the user’s question from a search index (such as FAISS or Elasticsearch). Those passages are then passed to the language model (such as GPT) as context, so the chatbot grounds its response in the retrieved content.
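
To make the extract, index, and retrieve steps concrete, here is a rough sketch using only the Python standard library so it runs without any installs. Note the substitutions: `html.parser` stands in for BeautifulSoup, and a simple bag-of-words cosine similarity stands in for an embedding index like FAISS, so treat it as an illustration of the flow rather than a production setup. `SAMPLE_HTML` and all the function names are made up for this example. In practice you would fetch the page from your URL (for example with `urllib.request.urlopen`) and send the final prompt to whichever model you use.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Step 1: extract plain text from a page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_text(text, max_words=40):
    """Step 2: split the text into fixed-size word windows to index."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Step 3: rank chunks by similarity to the query and keep the best ones."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:top_k]


def build_prompt(query, chunks, top_k=2):
    """Step 4: put the retrieved chunks in front of the question for the LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical page content standing in for a crawled URL.
SAMPLE_HTML = """
<html><head><style>p { color: red; }</style></head><body>
<p>Our store ships orders within two business days.</p>
<p>Refunds are accepted up to thirty days after purchase.</p>
<script>console.log("ignored");</script>
</body></html>
"""

chunks = chunk_text(html_to_text(SAMPLE_HTML), max_words=8)
prompt = build_prompt("How do refunds work?", chunks, top_k=1)
print(prompt)
```

Swapping in the real tools mostly means replacing `html_to_text` with a BeautifulSoup or Scrapy extraction step and replacing `retrieve` with an embedding model plus a FAISS or Elasticsearch lookup. The overall shape stays the same.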
