Apparently it’s not possible to train a LoRA for a GPTQ model. If I’m incorrect, please let me know. We’re using TheBloke/Wizard-Vicuna-30B-Superhot-8K-GPTQ, which runs fast and does everything we need it to do on our NVIDIA A40 with 48GB of VRAM.
However, to train a LoRA for this model, I’ve concluded we need to train against the original model, which is ehartford/wizard_vicuna_70k_unfiltered and is roughly 130GB.
In order to train against the original model, we’ll need to lease a GPU server with 130GB of VRAM, which is roughly $3500 a month, so…can someone please tell me if a LoRA trained from the original model will FOR SURE work on the GPTQ version of the model, or do I have any of this wrong?
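
For anyone checking the sizing, here is a rough back-of-the-envelope sketch of how much memory the weights alone would take at different precisions. The parameter count of 32.5 billion is an assumption (the usual figure quoted for LLaMA "30B" class models), and the helper name `weight_memory_gb` is made up for illustration. It covers weights only, so gradients, optimizer state, and activations during training would come on top of this.

```python
# Rough estimate of weight storage at different precisions.
# ASSUMPTION: ~32.5e9 parameters for a "30B" LLaMA-class model.
# This counts weights only, not gradients, optimizer state, or activations.

ASSUMED_PARAMS = 32.5e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "int4": 0.5,  # 4-bit quantization, ignoring per-group scale overhead
}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


estimates = {
    name: weight_memory_gb(ASSUMED_PARAMS, size)
    for name, size in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name}: about {gb:.0f} GB of weights")
```

Under that assumption, the 130GB figure lines up with a full-precision (fp32) checkpoint, and loading the same weights in half precision would need about half of that. Whether half precision is acceptable for the training setup in question is a separate matter that this sketch does not answer.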