Issues with fine-tuning GPT NeoX using LoRA

Hi, I’m trying to fine-tune GPT NeoX 20B using LoRA and peft. The process runs fine: it takes about 12 hours on my dataset and the training loss is acceptable. But when it finishes, the adapter_model.bin file is only 443 bytes, when it should be at least a few MB. When I load the adapter, the model gives unexpected outputs. Here is the script: alpaca-lora_gpt_neox_20b/ at main · satani99/alpaca-lora_gpt_neox_20b · GitHub (base_model is not specified in the script, but I set it before starting the fine-tuning run).
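A quick way to catch this symptom before loading the adapter is to check the file size on disk. The sketch below is only an illustration: the helper name `adapter_looks_empty` and the 1 KB threshold are assumptions, not part of peft or the linked script.

```python
import os

def adapter_looks_empty(path, min_bytes=1024):
    """Return True if the saved adapter file is suspiciously small.

    min_bytes is an arbitrary illustrative threshold (an assumption):
    a file of a few hundred bytes cannot hold real LoRA weights.
    """
    return os.path.getsize(path) < min_bytes

# Example (hypothetical path):
# adapter_looks_empty("lora-out/adapter_model.bin")
```

If this returns True right after training, the saved file most likely contains no trained weights, and the save step is the place to look rather than the training run itself.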

I might be wrong, but I think the problem is in lora_target_modules, where the target module is specified incorrectly. I printed the model’s modules after loading it, and there are about 120 of them. Which ones should I target? All of them?
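A long list of printed modules is usually the same few layer types repeated once per transformer block, so it can help to collapse the list to its distinct leaf names before choosing targets. The sketch below assumes the full dotted names come from something like `model.named_modules()`; the helper name `leaf_module_counts` is made up for illustration.

```python
from collections import Counter

def leaf_module_counts(module_names):
    """Count how often each leaf name (last dotted component) occurs."""
    counts = Counter()
    for name in module_names:
        if name:  # the root module has an empty name; skip it
            counts[name.rsplit(".", 1)[-1]] += 1
    return counts

# Example with a loaded model (assumes `model` already exists):
# for leaf, n in leaf_module_counts(n for n, _ in model.named_modules()).most_common():
#     print(leaf, n)
```

The distinct leaf names are the candidates for lora_target_modules. As far as I know, the Hugging Face GPT-NeoX implementation names its fused attention projection query_key_value, but treat that as an assumption and confirm it against the names the model actually prints.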

If anyone knows what the problem is, please let me know.

I’m getting the same result: a 443-byte adapter_model.bin file. I was running alpaca-lora/ at main · tloen/alpaca-lora · GitHub.

This fixed it for me: adapter_model.bin not being updated and only 443 bytes after finetuning · Issue #293 · tloen/alpaca-lora · GitHub

Hello, I did that, but the file did not change after the re-install. Do I have to run some script to overwrite the old file? What exactly did you do, other than re-installing peft? Thanks in advance!

Update: I fixed it by resuming from the latest checkpoint. All good now.
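For anyone trying the same thing, here is a minimal sketch of locating the latest checkpoint. It assumes the output directory uses the Hugging Face Trainer naming convention `checkpoint-<step>`; the helper name `latest_checkpoint` is hypothetical.

```python
import os
import re

def latest_checkpoint(output_dir):
    """Return the path of the checkpoint-<step> directory with the highest step, or None."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    best_step, best_path = -1, None
    for entry in os.listdir(output_dir):
        match = pattern.match(entry)
        full_path = os.path.join(output_dir, entry)
        # Compare steps numerically so checkpoint-1000 beats checkpoint-200.
        if match and os.path.isdir(full_path) and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), full_path
    return best_path

# Example (hypothetical directory):
# path = latest_checkpoint("lora-out")
```

The returned path can then be passed as the script's resume option, or to `trainer.train(resume_from_checkpoint=path)` if you call the Trainer directly.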
