LoRA training with Accelerate / DeepSpeed

I’m training LoRA adapters for large models with Accelerate + DeepSpeed ZeRO-3, using the alignment-handbook implementation. However, when checkpoints are saved, the full model is written out, whereas I only need the adapter (and possibly the optimizer states). Is it possible to configure training so that only the adapters are saved?
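
For reference, this is roughly the save path I was hoping to end up with — a minimal, untested sketch based on the Accelerate/PEFT docs, not what alignment-handbook actually does. It assumes the model is a `PeftModel` returned by `accelerator.prepare(...)` and that `zero3_save_16bit_model` (DeepSpeed's `stage3_gather_16bit_weights_on_model_save`) is enabled so the sharded weights can be gathered at save time:

```python
from accelerate import Accelerator


def save_adapter_only(accelerator: Accelerator, model, output_dir: str):
    """Sketch: write only the LoRA adapter from a ZeRO-3 sharded PeftModel."""
    accelerator.wait_for_everyone()
    # Gather the ZeRO-3 shards into a full state dict (requires
    # stage3_gather_16bit_weights_on_model_save to be enabled).
    state_dict = accelerator.get_state_dict(model)
    unwrapped = accelerator.unwrap_model(model)
    # PeftModel.save_pretrained keeps only the adapter weights and config,
    # filtering the LoRA parameters out of the passed state_dict.
    unwrapped.save_pretrained(
        output_dir,
        is_main_process=accelerator.is_main_process,
        save_function=accelerator.save,
        state_dict=state_dict,
    )
```

If something equivalent can be achieved purely through configuration, that would of course be preferable.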

Or should I use a MULTI_GPU setup instead of DeepSpeed?

The model being trained is Llama-70B on 8×A100 40GB, so fitting the full model on a single GPU is really not possible.