Batch size limit 32

Hello, I’m using the mixedbread-ai/mxbai-embed-large-v1 model on a dedicated Inference Endpoint (configuration shown in the attached screenshot). My goal is to embed a PDF file through this endpoint and store the vectors in a vector database using LangChain. However, when I run the code below I get an error saying “maximum allowed batch size 32”. I would appreciate any help resolving this.

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceHubEmbeddings
# Embedding client pointed at the dedicated Inference Endpoint
embeddings = HuggingFaceHubEmbeddings(
    model="https://sze1pr91t48e3kuu.us-east-1.aws.endpoints.huggingface.cloud",
    huggingfacehub_api_token="hf_xxxxx",
)



loader = PyPDFLoader("aws.pdf")
docs = loader.load()
MARKDOWN_SEPARATORS = [
    "\n#{1,6} ",
    "```\n",
    "\n\\*\\*\\*+\n",
    "\n---+\n",
    "\n___+\n",
    "\n\n",
    "\n",
    " ",
    "",
]


text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=5000,  # the maximum number of characters in a chunk: we selected this value arbitrarily
    chunk_overlap=500,  # the number of characters to overlap between chunks
    add_start_index=True,  # If `True`, includes chunk's start index in metadata
    strip_whitespace=True,  # If `True`, strips whitespace from the start and end of every document
    separators=MARKDOWN_SEPARATORS,
)

docs_processed = []
for doc in docs:
    docs_processed += text_splitter.split_documents([doc])
db = Chroma.from_documents(docs_processed, embeddings, persist_directory="./chroma_db")
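
If it helps with the diagnosis: as far as I can tell, Chroma.from_documents hands every chunk to embed_documents in a single request, and the endpoint rejects any request with more than 32 texts. Calling the embedding client directly seems to confirm this (a minimal sketch; my aws.pdf produces well over 32 chunks):

texts = [doc.page_content for doc in docs_processed]

# A single request with at most 32 texts goes through
ok = embeddings.embed_documents(texts[:32])
print(len(ok), len(ok[0]))

# One request with all chunks fails with "maximum allowed batch size 32"
# embeddings.embed_documents(texts)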


Did you get a resolution?

I already raised a PR for this a few months ago: community: Batching added in embed_documents of HuggingFaceInferenceAPIEmbeddings by abhishek9998 · Pull Request #16457 · langchain-ai/langchain · GitHub
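
In the meantime, a workaround is to add the chunks to Chroma in batches of at most 32, so each call to the endpoint stays under its limit. A minimal sketch against the code above (the value 32 just mirrors the “maximum allowed batch size 32” the endpoint reports):

batch_size = 32  # matches the endpoint's reported limit

# Build the collection from the first batch, then append the rest,
# so each underlying embed_documents call sends at most 32 texts
db = Chroma.from_documents(
    docs_processed[:batch_size],
    embeddings,
    persist_directory="./chroma_db",
)
for i in range(batch_size, len(docs_processed), batch_size):
    db.add_documents(docs_processed[i : i + batch_size])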