We’ve tried reproducing the results from alpaca_lora and are seeing odd behavior:
- the official adapter weights work fine with the 7B model
- the adapters we get when we finetune the same model with the official finetuning script produce nonsensical output when used with the official generation script (see the loading sketch below)
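
For reference, this is roughly how we load our adapters for generation; it mirrors the official generation script as far as we can tell, but the model ID and adapter path below are placeholders for our local setup, not necessarily the exact values from the repo:

```python
# Minimal sketch of our generation setup (paths/IDs are assumptions).
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model_id = "decapoda-research/llama-7b-hf"  # assumed 7B base model
adapter_path = "./lora-alpaca"                   # output dir of our finetune run

tokenizer = LlamaTokenizer.from_pretrained(base_model_id)
model = LlamaForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Swapping adapter_path for the published alpaca-lora adapter gives sensible
# output with the exact same code; our own adapters do not.
model = PeftModel.from_pretrained(model, adapter_path, torch_dtype=torch.float16)
model.eval()

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nList three primary colors.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```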
I suspect something is going wrong during finetuning: hyperparameter reproduction, RNG seeds, dataset construction or prompt formatting, or something else that’s hard to catch. One check we’ve been running for the formatting hypothesis is sketched below.
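
The idea is to rebuild the training prompt for a sample the way we believe the finetuning script formats it, and compare it against what the generation script sends (same headers, same trailing "### Response:" marker). The templates and field names below are our reading of the Alpaca data format, so treat them as assumptions:

```python
# Sanity check for train/generate prompt mismatch (templates are assumptions).
import json

TEMPLATE_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
TEMPLATE_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(example: dict) -> str:
    """Format one alpaca_data.json record the way we believe training does."""
    if example.get("input"):
        return TEMPLATE_WITH_INPUT.format(**example)
    return TEMPLATE_NO_INPUT.format(instruction=example["instruction"])

with open("alpaca_data.json") as f:
    sample = json.load(f)[0]
# repr() makes stray whitespace/newline differences visible
print(repr(build_prompt(sample)))
```

So far this hasn’t turned up an obvious mismatch on our side, which is why we’re asking here.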
Any input from the alpaca-lora people or the PEFT/LoRA developers would be welcome.