Fine-tuning an 8G model on chat data, with and without LoRA.
With LoRA:
from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=128,
    lora_dropout=0.05,
    r=256,
    bias="none",
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
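For reference, with these settings the LoRA update is scaled by lora_alpha / r = 128 / 256 = 0.5. A minimal numpy sketch of the idea behind the adapter (toy sizes, not the peft internals):

```python
import numpy as np

# Sketch of the LoRA forward pass: the frozen weight W is augmented
# with a low-rank update B @ A, scaled by lora_alpha / r.
rng = np.random.default_rng(0)
d_in, d_out, r, lora_alpha = 16, 16, 4, 8  # toy sizes, not the real config

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, initialized small
B = np.zeros((d_out, r))               # trainable, initialized to zero
x = rng.normal(size=(d_in,))

scaling = lora_alpha / r
y = W @ x + scaling * (B @ (A @ x))

# Because B starts at zero, the LoRA branch contributes nothing at
# initialization, so the adapted model initially matches the base model.
assert np.allclose(y, W @ x)
```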
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    packing=True,
    dataset_kwargs={
        "add_special_tokens": False,
        "append_concat_token": False,
    },
)
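One relevant difference between the two runs: when peft_config is passed, SFTTrainer trains only the adapter weights and freezes the base model. A rough sketch of the per-layer parameter counts (toy dimension, not the real model's shapes):

```python
# Illustrative parameter count for a single d x d linear layer.
# An actual 8B model has many such layers of varying shapes.
d, r = 4096, 256

full_ft_params = d * d    # every weight is updated in full fine-tuning
lora_params = 2 * r * d   # only A (r x d) and B (d x r) train with LoRA

# With r=256 and d=4096, LoRA trains 12.5% as many parameters per layer.
print(full_ft_params, lora_params, lora_params / full_ft_params)
```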
Without LoRA:
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    packing=True,
    dataset_kwargs={
        "add_special_tokens": False,
        "append_concat_token": False,
    },
)
Everything else (model, data, training parameters) is the same. LoRA gave really good results, but without LoRA the model produced unrelated responses.
Does anyone have experience with this or clues about the cause?
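For context, one variable I have not ruled out (an assumption, since training_args is not shown above): both runs used the same learning rate. A rate that works well for LoRA adapters (often around 1e-4 to 3e-4) is typically far too high for full fine-tuning and can destroy the pretrained weights. A hypothetical, more conservative configuration for the full fine-tuning run might look like:

```python
from transformers import TrainingArguments

# Illustrative settings only; the actual training_args used above are
# not shown, so these values are a sketch, not a reproduction of my setup.
full_ft_args = TrainingArguments(
    output_dir="full-ft-out",   # placeholder path
    learning_rate=1e-5,         # full FT: ~1e-5 to 5e-5 is typical
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    bf16=True,
    gradient_checkpointing=True,
    logging_steps=10,
)
```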