Hi Team,
As I choose I’m a beginner , I am developing the Chatbot
Usecase : chat with a document source(Size less than 1GB).
Current State : Stored the document chunk and retrieving it based on the user question.
Now I wanted to beautify and structure the junk and give as a response.
To do that beatifying what model I have to use ? any advise ? Please help to address me with next step.
1 Like
hi @Arulprasath
We’re talking about HTML Beautifying, right? Why don’t you use simple solution like Beautiful Soup Documentation — Beautiful Soup 4.4.0 documentation to get text?
I guess you’re building kind of RAG. If your system can retrieve relevant part based on the user question, you might give a shoot question answering
pipeline:
from transformers import pipeline
question_answerer = pipeline("question-answering")
question_answerer(
question="user question",
context="context what you retrieve and beautify",
)
1 Like
Hi @mahmutc , Thanks for the info given.
I’m not looking for the UI cosmetic changes. That streamlit can help.
I finished until the chunk matching for the user question. In the junk I am getting the exact text i loaded with a broken statement. as a next step what I can use for correcting and structuring the mach chunk ?