PPOTrainer: KeyError: 'quant_storage'

I am attempting to train an SFT model using the PPOTrainer but receive the following error when initializing PPOTrainer. This doesn’t appear to be an error caused by something I’m passing to PPOTrainer, but I could be mistaken. Note: both the SFT model and the Reward model are quantized (LoRA adapters have been merged using merge_and_unload()).

  • SFT model: braunagn/joeyGPT-sft-merged-v1
  • Reward model: braunagn/joeyGPT-reward-merged-v1
/usr/local/lib/python3.10/dist-packages/bitsandbytes/nn/modules.py in __setstate__(self, state)
    247         self.quant_state = state["quant_state"]
    248         self.data = state["data"]
--> 249         self.quant_storage = state["quant_storage"]
    250         self.bnb_quantized = state["bnb_quantized"]
    251         self.module = state["module"]
KeyError: 'quant_storage'
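The failure pattern in the traceback — a `__setstate__` expecting a key that the matching `__getstate__` never wrote — is what you hit when `copy.deepcopy` serializes and restores an object across mismatched serialization code (for example, parameters saved by an older bitsandbytes version and restored by one that expects `quant_storage`). A minimal, hypothetical sketch of that mechanism (not bitsandbytes code):

```python
import copy


class Param:
    """Hypothetical stand-in for a quantized parameter whose
    __getstate__ omits a key that __setstate__ expects."""

    def __init__(self):
        self.data = [1, 2, 3]
        self.quant_storage = "uint8"

    def __getstate__(self):
        # Older serialization path: forgets to include `quant_storage`
        return {"data": self.data}

    def __setstate__(self, state):
        self.data = state["data"]
        # Newer restore path assumes the key exists -> KeyError
        self.quant_storage = state["quant_storage"]


try:
    copy.deepcopy(Param())  # deepcopy round-trips via __getstate__/__setstate__
except KeyError as e:
    print(f"KeyError: {e}")
```

PPOTrainer deep-copies the model to build the reference model when `ref_model=None`, so a mismatch like this surfaces at trainer initialization; checking that your bitsandbytes version matches the one the checkpoints were quantized with is one thing worth ruling out.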

The initialization is straightforward (I followed these instructions):

ppo_trainer = PPOTrainer(
    config=ppo_config,
    model=sft_model,
    ref_model=None,  # will re-use `model` if set to None
    tokenizer=tokenizer,
    dataset=dataset,
    optimizer=None,  # None defaults to Adam optimizer with the linear learning rate specified in PPOConfig
    num_shared_layers=None,  # None defaults to all layers are shared
)

Please see this notebook (last cell) for a reproducible error.