Here is the code snippet:
```python
import os

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma

embeddings_model_name = os.environ.get('EMBEDDINGS_MODEL_NAME', 'all-MiniLM-L6-v2')

# persist_directory, CHROMA_SETTINGS, does_vectorstore_exist, and
# process_documents are all defined elsewhere in ingest.py

def main():
    # Create embeddings
    embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
    if does_vectorstore_exist(persist_directory):
        # Update and store the local vectorstore
        print(f"Appending to existing vectorstore at {persist_directory}")
        db = Chroma(persist_directory=persist_directory, embedding_function=embeddings, client_settings=CHROMA_SETTINGS)
        # db = Chroma(persist_directory=persist_directory, embeddings, client_settings=CHROMA_SETTINGS)
        collection = db.get()
        texts = process_documents([metadata['source'] for metadata in collection['metadatas']])
        print("Creating embeddings. May take some minutes...")
        db.add_documents(texts)
    else:
        # Create and store the local vectorstore
        print("Creating new vectorstore")
        texts = process_documents()
        print("Creating embeddings. May take some minutes...")
        # Modified the statement below from `embeddings` to `embedding_function=embeddings`
        # db = Chroma.from_documents(texts, embeddings, persist_directory=persist_directory)
        db = Chroma.from_documents(texts, embedding_function=embeddings, persist_directory=persist_directory)
        # db = Chroma.from_documents(texts, embedding_function=embeddings.embed_documents, persist_directory=persist_directory)
    db.persist()
    db = None
    print("Ingestion complete! You can now run privateGPT.py to query your documents")
```
Here is the error message I get after running:
```
Loading new documents: 100%|██████████████| 18823/18823 [20:54<00:00, 15.01it/s]
Loaded 56261545 new documents from /mnt/source_documents
Split into 80123280 chunks of text (max. 500 tokens each)
Creating embeddings. May take some minutes...
Traceback (most recent call last):
  File "/home/fdavidg/ollama/ollama-0.5.4/examples/langchain-python-rag-privategpt/ingest.py", line 179, in <module>
    main()
  File "/home/fdavidg/ollama/ollama-0.5.4/examples/langchain-python-rag-privategpt/ingest.py", line 169, in main
    db = Chroma.from_documents(texts, embedding_function=embeddings, persist_directory=persist_directory)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fdavidg/ollama/ollama-0.5.4/examples/langchain-python-rag-privategpt/GPTAllDocs/lib/python3.12/site-packages/langchain/vectorstores/chroma.py", line 612, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
  File "/home/fdavidg/ollama/ollama-0.5.4/examples/langchain-python-rag-privategpt/GPTAllDocs/lib/python3.12/site-packages/langchain/vectorstores/chroma.py", line 567, in from_texts
    chroma_collection = cls(
                        ^^^^
TypeError: langchain.vectorstores.chroma.Chroma() got multiple values for keyword argument 'embedding_function'
```
I have checked versions, and chromadb is up to date.
I am new to most of this AI programming, but I have experience with Python. I am at a loss to determine what I am doing wrong. Any help, please!
Thank you.
Sincerely,
Frank D Gunseor