Pipeline cannot infer suitable model classes from: <model_name>

finetuned a model (decapoda-research/llama-7b-hf 路 Hugging Face) using peft and lora and saved as https://huggingface.co/lucas0/empath-llama-7b. Now im getting Pipeline cannot infer suitable model classes from when trying to use it along with with langchain and chroma vectordb:

from langchain.embeddings import HuggingFaceHubEmbeddings
from langchain import PromptTemplate, HuggingFaceHub, LLMChain
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma

repo_id = "sentence-transformers/all-mpnet-base-v2"
embedder = HuggingFaceHubEmbeddings(

embeddings = embedder.embed_documents(texts=comments)
docsearch = Chroma.from_texts(comments, embedder).as_retriever()
#docsearch = Chroma.from_documents(texts, embeddings)

#llm = HuggingFaceHub(repo_id='decapoda-research/llama-7b-hf', huggingfacehub_api_token='XXXXX')
llm = HuggingFaceHub(repo_id='lucas0/empath-llama-7b', huggingfacehub_api_token='XXXXX')
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch, return_source_documents=False)

q = input("input your query:")
result = qa.run(query=q)


is anyone able to tell me how to fix this? Is it an issue with the model card? I was facing issues with the lack of the config.json file and ended up just placing the same config.json as the model I used as base for the lora fine-tuning. Could that be the origin of the issue? If so, how to generate the correct config.json without having to get the original llama weights?

Also, is there a way of loading several sentences into a custom HF model (not only OpenAi, as the tutorials show) without using vector dbs?


Also asked and answered on python - Pipeline cannot infer suitable model classes from: <model_name> - HuggingFace - Stack Overflow

TL;DR: Google Colab