I am attempting to train an SFT model using the PPOTrainer but receive the following error when initializing PPOTrainer. This doesn't appear to be caused by anything I'm passing to PPOTrainer, but I could be mistaken. Note: both the SFT model and the reward model are quantized, and the LoRA adapters have been merged using `merge_and_unload()` (sketched below).
- SFT model: braunagn/joeyGPT-sft-merged-v1
- Reward model: braunagn/joeyGPT-reward-merged-v1
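For context, the merge step looked roughly like this (a minimal sketch; the base model and adapter identifiers below are placeholders, not the actual repos used):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Placeholder identifiers -- the actual base model and adapter repos differ.
base = AutoModelForCausalLM.from_pretrained("base-model-id")
peft_model = PeftModel.from_pretrained(base, "lora-adapter-id")

# Fold the LoRA weights into the base model and drop the adapter wrapper.
merged = peft_model.merge_and_unload()
merged.save_pretrained("joeyGPT-sft-merged-v1")
```

The full traceback from the PPOTrainer initialization: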
```
/usr/local/lib/python3.10/dist-packages/bitsandbytes/nn/modules.py in __setstate__(self, state)
    247         self.quant_state = state["quant_state"]
    248         self.data = state["data"]
--> 249         self.quant_storage = state["quant_storage"]
    250         self.bnb_quantized = state["bnb_quantized"]
    251         self.module = state["module"]

KeyError: 'quant_storage'
```
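If it helps, the failing `__setstate__` only runs when a bitsandbytes `Params4bit` parameter is pickled or deep-copied, and PPOTrainer builds the reference model by deep-copying `model` when `ref_model=None`. A minimal sketch that may reproduce the error outside of TRL (the 4-bit settings here are assumptions, not the exact ones used):

```python
import copy

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the merged SFT model in 4-bit (quantization settings are assumed).
model = AutoModelForCausalLM.from_pretrained(
    "braunagn/joeyGPT-sft-merged-v1",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# PPOTrainer(ref_model=None) deep-copies `model` to build the reference
# model; deepcopy round-trips every Params4bit through
# __getstate__/__setstate__, which is where KeyError: 'quant_storage'
# is raised above.
ref_model = copy.deepcopy(model)
```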
The initialization is straightforward (I followed these instructions):
```python
from trl import PPOTrainer

ppo_trainer = PPOTrainer(
    config=ppo_config,
    model=sft_model,
    ref_model=None,          # re-uses `model` as the reference model if set to None
    tokenizer=tokenizer,
    dataset=dataset,
    optimizer=None,          # None defaults to the Adam optimizer with the linear learning rate specified in PPOConfig
    num_shared_layers=None,  # None means all layers are shared
)
```
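For completeness, one way to side-step the internal deep copy is to pass an explicit `ref_model` loaded separately from the Hub instead of `None` (a minimal sketch, assuming `sft_model` is wrapped with TRL's `AutoModelForCausalLMWithValueHead` and assuming the 4-bit settings; whether this avoids the error is unverified):

```python
from transformers import BitsAndBytesConfig
from trl import AutoModelForCausalLMWithValueHead, PPOTrainer

# Load a second, independent copy of the SFT model to serve as the frozen
# reference model, so PPOTrainer never deep-copies `model`.
# The 4-bit settings here are assumptions, not the exact ones used.
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "braunagn/joeyGPT-sft-merged-v1",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

ppo_trainer = PPOTrainer(
    config=ppo_config,
    model=sft_model,
    ref_model=ref_model,  # explicit reference model instead of None
    tokenizer=tokenizer,
    dataset=dataset,
)
```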
Please see this notebook (last cell) for a reproducible example of the error.