AutoModelForCausalLM() to HuggingFaceLLM

Hey, I fine-tuned the Llama model using PEFT and QLoRA, and I load the model as follows:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

PEFT_MODEL = "/kaggle/working/trained-model"

config = PeftConfig.from_pretrained(PEFT_MODEL)

# bnb_config is the same BitsAndBytesConfig I used during QLoRA training
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token

# attach the trained LoRA adapters to the quantized base model
model = PeftModel.from_pretrained(model, PEFT_MODEL)
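For reference, generating from this model directly works, along these lines (a rough sketch; the prompt is just an example):

prompt = "What is retrieval-augmented generation?"  # example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))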

I want to use this fine-tuned model in my RAG pipeline, which is built with LlamaIndex. The issue is that the LlamaIndex query functions expect the model to be wrapped in HuggingFaceLLM, i.e. loaded like this:

from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core.prompts import PromptTemplate

llm = HuggingFaceLLM(
    model_name=model_loc,
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    # query_wrapper_prompt=PromptTemplate("<|system|>Please check if the following pieces of context have any mention of the keywords provided in the question. If not, then say that you do not know the answer. Please do not make up your own answer.</s>\n<|user|>\nQuestion: {query_str}</s>\n<|assistant|>\n"),
    # query_wrapper_prompt=PromptTemplate(template),
    context_window=4096,
    max_new_tokens=512,
    # model_kwargs={"trust_remote_code": True},
    # model_kwargs={"n_gpu_layers": -1},
    generate_kwargs={"temperature": 0.5},
    device_map="auto",
)

Otherwise, if I use the model loaded above directly, I get AttributeError: 'LlamaForCausalLM' object has no attribute 'predict' when I run the following code:

response = sentence_query_engine.query(query)
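For context, sentence_query_engine comes from a standard LlamaIndex setup along these lines (a sketch only; my actual index construction, data path, and embedding model are omitted):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# build a vector index over the documents ("./data" is a placeholder path)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# the query engine takes the HuggingFaceLLM wrapper, not a raw transformers model
sentence_query_engine = index.as_query_engine(llm=llm)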

Is there a way to address this? (For example, by converting the loaded CausalLM model into a HuggingFaceLLM instance, as in the sketch below.) And if there isn't, can you suggest how I can fine-tune the model for RAG so that I can load it with the HuggingFaceLLM class and use it in my codebase with LlamaIndex?
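To make the question concrete, this is roughly what I am hoping is possible (a sketch only; I am assuming HuggingFaceLLM can accept pre-loaded model and tokenizer objects through model= and tokenizer= arguments, which I have not verified):

# hypothetical: wrap the already-loaded PEFT model instead of a model name
llm = HuggingFaceLLM(
    model=model,          # the PeftModel loaded above (assumed to be accepted)
    tokenizer=tokenizer,  # the tokenizer loaded above
    context_window=4096,
    max_new_tokens=512,
    generate_kwargs={"temperature": 0.5},
)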