When using LoftQ as a finetuning method, you need to initialize the model in full precision first; quantization is handled under the hood via loftq_config. With QLoRA, by contrast, you can quantize the model directly by passing a quantization config to from_pretrained before finetuning.
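To make the contrast concrete, here is a minimal QLoRA-style sketch: the base weights are quantized at load time and standard LoRA adapters are attached on top. The model name, rank, and target modules are placeholder assumptions; running this downloads a large checkpoint.

```python
# QLoRA sketch: quantize the base model directly at load time,
# then attach ordinary LoRA adapters for finetuning.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # 4-bit NF4 quantization on load
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",             # placeholder model; any causal LM works
    quantization_config=bnb_config,          # quantization happens here, up front
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],     # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)   # adapters sit on quantized weights
```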
Yep, which is why LoftQConfig was a confusing addition. You are meant to apply the LoftQ technique to full-precision pre-trained weights first, as seen here. From there, you can quantize and save the model, so that in the future you only need to load the quantized version.
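A sketch of that intended workflow, assuming the same placeholder model as above: load the full-precision weights, let PEFT initialize the adapters with LoftQ (quantization happens inside the init step), then save the result for later reuse.

```python
# LoftQ sketch: start from FULL-PRECISION weights; LoftQConfig drives
# the quantization under the hood during adapter initialization.
from transformers import AutoModelForCausalLM
from peft import LoftQConfig, LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1"              # no quantization_config here
)

loftq_config = LoftQConfig(loftq_bits=4)     # target 4-bit quantization
lora_config = LoraConfig(
    init_lora_weights="loftq",               # LoftQ-aware adapter init
    loftq_config=loftq_config,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],     # assumed attention projections
)

peft_model = get_peft_model(base, lora_config)
peft_model.save_pretrained("loftq-init")     # save so future runs skip this step
```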
My suggestion is either to use the “LoftQ-applied” models already available on the Hugging Face Hub, or to stick with QLoRA.