Hey, I am using a ChromaDB vector store for PDF data extraction (text and images) and langchain_community.llms for chat. But the project re-converts the PDF file on every run, so I want to save the vector conversion into a separate file, then load it and reuse it with the chat LLM.
I am new to ChromaDB, so I don't know how to save the collections in the vector store (including texts and images) and load that file again for interaction.
Any help on this would be appreciated.
I will show you my examples.
If you read my reply, please give me some feedback!
I will wait for your feedback and appreciate any input.
I am also willing to explain LangChain and vector databases to you.
Hope this helps!
I missed one of your questions.
ChromaDB can store and work with images, particularly in the context of vector embeddings. You can convert images into embeddings using a model like CLIP (Contrastive Language-Image Pretraining) or another image feature extractor, and then store those embeddings in ChromaDB. This allows you to perform tasks like image search or image classification based on the embeddings.
Thanks for your reply, Mr. Alan.
Those are great examples, thanks for sharing.
But the solution I am looking for here is to pass in 3-4 PDFs and create a vector database file with a *.db or *.pkl extension (which should also contain the extracted image files), then import it and embed it into the chatbot, so queries run against those files and return the nearest search results combined with the LLM's output.
This is what I am working on:
I don't want to upload the PDFs every time; on the next run I want to remove the PDF-upload module and just work with the PDFs' data.
My examples above follow these styles: the first one is just a project, and the second and third are sample code that uses a vector DB with PDF files and converts them into a database.