I fine-tuned a model (decapoda-research/llama-7b-hf · Hugging Face) using PEFT and LoRA and saved it as https://huggingface.co/lucas0/empath-llama-7b. Now I'm getting `Pipeline cannot infer suitable model classes from` when trying to use it together with LangChain and a Chroma vector DB:
```python
from langchain.embeddings import HuggingFaceHubEmbeddings
from langchain import HuggingFaceHub
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma

repo_id = "sentence-transformers/all-mpnet-base-v2"
embedder = HuggingFaceHubEmbeddings(
    repo_id=repo_id,
    task="feature-extraction",
    huggingfacehub_api_token="XXXXX",
)
embeddings = embedder.embed_documents(texts=comments)

docsearch = Chroma.from_texts(comments, embedder).as_retriever()
#docsearch = Chroma.from_documents(texts, embeddings)

#llm = HuggingFaceHub(repo_id='decapoda-research/llama-7b-hf', huggingfacehub_api_token='XXXXX')
llm = HuggingFaceHub(repo_id='lucas0/empath-llama-7b', huggingfacehub_api_token='XXXXX')

qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff",
                                 retriever=docsearch, return_source_documents=False)

q = input("input your query: ")
result = qa.run(query=q)  # run() returns a plain string when return_source_documents=False
print(result)
#print(result["source_documents"])
```
Is anyone able to tell me how to fix this? Is it an issue with the model card? I was facing issues with the missing config.json file and ended up just copying the config.json from the model I used as the base for the LoRA fine-tuning. Could that be the origin of the issue? If so, how can I generate the correct config.json without having to get the original LLaMA weights?
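For what it's worth, my fallback plan is to skip the Hub inference API entirely, load the adapter locally with peft on top of the base model, and wrap it in a local pipeline for LangChain. This is an untested sketch (and I'm not sure the decapoda tokenizer loads cleanly via AutoTokenizer), but something like:

```python
# Untested sketch: load the LoRA adapter locally instead of via HuggingFaceHub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel
from langchain.llms import HuggingFacePipeline

base = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("decapoda-research/llama-7b-hf")

# apply the adapter weights on top of the base model
model = PeftModel.from_pretrained(base, "lucas0/empath-llama-7b")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=256)
llm = HuggingFacePipeline(pipeline=pipe)  # drop-in replacement for the HuggingFaceHub llm above
```

If that works, it would at least tell me whether the copied config.json is what's breaking the hosted pipeline.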
Also, is there a way of loading several sentences into a custom HF model (not only OpenAI ones, as the tutorials show) without using vector DBs?
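To make that last question concrete, the sketch below is roughly what I have in mind: reusing `embedder` and `comments` from the snippet above, keeping the embeddings in a plain array, and doing the similarity lookup by hand (untested):

```python
# Untested sketch: in-memory retrieval without Chroma or any vector DB.
import numpy as np

doc_vecs = np.array(embedder.embed_documents(texts=comments))
query_vec = np.array(embedder.embed_query("input your query here"))

# cosine similarity between the query and every comment
scores = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
)
top_k = np.argsort(scores)[::-1][:3]  # indices of the 3 most similar comments
context = "\n".join(comments[i] for i in top_k)
```

The retrieved `context` would then be pasted into a PromptTemplate and sent straight to the custom HF model.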