I fine-tuned a model (decapoda-research/llama-7b-hf · Hugging Face) using PEFT and LoRA and saved it as https://huggingface.co/lucas0/empath-llama-7b. Now I'm getting "Pipeline cannot infer suitable model classes from"
when trying to use it along with LangChain and a Chroma vector DB:
from langchain.embeddings import HuggingFaceHubEmbeddings
from langchain import HuggingFaceHub, LLMChain
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma
repo_id = "sentence-transformers/all-mpnet-base-v2"
embedder = HuggingFaceHubEmbeddings(
    repo_id=repo_id,
    task="feature-extraction",
    huggingfacehub_api_token="XXXXX",
)

# comments is a list of strings holding the texts I want to index
embeddings = embedder.embed_documents(texts=comments)
docsearch = Chroma.from_texts(comments, embedder).as_retriever()
#docsearch = Chroma.from_documents(texts, embeddings)
#llm = HuggingFaceHub(repo_id='decapoda-research/llama-7b-hf', huggingfacehub_api_token='XXXXX')
llm = HuggingFaceHub(repo_id='lucas0/empath-llama-7b', huggingfacehub_api_token='XXXXX')
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch, return_source_documents=False)
q = input("input your query:")
result = qa({"query": q})
print(result["result"])
#print(result["source_documents"])
Is anyone able to tell me how to fix this? Is it an issue with the model card? I was running into problems because the repo lacked a config.json, and I ended up just copying the config.json from the model I used as the base for the LoRA fine-tuning. Could that be the origin of the issue? If so, how can I generate the correct config.json without having to get the original LLaMA weights?
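In case it matters, the fallback I had in mind is loading the adapter locally with peft and wrapping it for LangChain instead of going through the Hub inference API. This is only a rough sketch with the repo names from my setup above; the Auto* class choices and whether a PeftModel can be passed straight into a transformers pipeline are my assumptions, I haven't verified it avoids the error:

# sketch: load the LoRA adapter on top of the base model locally with peft,
# then wrap it so LangChain can use it instead of HuggingFaceHub (untested)
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel
from langchain.llms import HuggingFacePipeline

base_model = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = PeftModel.from_pretrained(base_model, "lucas0/empath-llama-7b")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=256)
local_llm = HuggingFacePipeline(pipeline=pipe)  # would replace the HuggingFaceHub llm above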
Also, is there a way of feeding several sentences into a custom HF model (not only OpenAI, as the tutorials show) without using a vector DB?
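By that I mean something along these lines, where the sentences are just concatenated into the prompt and run through a plain LLMChain. This is only a sketch reusing the llm, comments, and q names from the snippet above; whether this is the right way to do it with a custom HF model is exactly what I'm asking:

# sketch of the "no vector DB" variant: stuff the sentences directly into the prompt
from langchain import LLMChain
from langchain.prompts import PromptTemplate

template = """Use the context below to answer the question.

Context:
{context}

Question: {question}
Answer:"""

prompt = PromptTemplate(template=template, input_variables=["context", "question"])
chain = LLMChain(llm=llm, prompt=prompt)
answer = chain.run(context="\n".join(comments), question=q)
print(answer)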
Thanks!