safetensors_rust.SafetensorError while deserializing header: InvalidHeaderDeserialization

Hi,
I am trying to load a checkpoint after fine-tuning a CodeLlama model with PEFT, and it fails with this error:
File "/home/ubuntu/.cache/pypoetry/virtualenvs/llama-VNDZLnVB-py3.8/lib/python3.8/site-packages/safetensors/torch.py", line 308, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization

Here is my rather simple loading code:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "codellama/CodeLlama-7b-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_4bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
output_dir = "fine-tuned-code-llama/checkpoint-400"
model = PeftModel.from_pretrained(model, output_dir)

The model was fine-tuned in 4-bit mode. I am wondering whether this is a bug in safetensors loading or something I am doing incorrectly.
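
A quick way to check whether the checkpoint file itself is malformed: a safetensors file begins with an 8-byte little-endian length followed by a JSON header of that length, so the header can be read by hand. A minimal sketch, assuming the standard PEFT adapter filename adapter_model.safetensors (adjust the path if yours differs):

import json
import struct

path = "fine-tuned-code-llama/checkpoint-400/adapter_model.safetensors"  # assumed filename
with open(path, "rb") as f:
    header_len = struct.unpack("<Q", f.read(8))[0]  # first 8 bytes: header size
    header_bytes = f.read(header_len)               # then the JSON header itself
print(json.loads(header_bytes))  # fails here if the header is not valid JSON

If this raises, the file was already corrupt when it was saved, not broken by the loading code.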
Thank you!

There is currently a compatibility issue between PEFT and PyTorch that shows up with custom training setups.

I encountered this issue because of the two optimizations below, which I had applied before training. Removing them allowed my models to be saved and loaded correctly, although it did increase the VRAM requirements of training.

import sys
import torch
from peft import get_peft_model_state_dict

# Monkey-patch state_dict so that only the PEFT adapter weights are saved
old_state_dict = model.state_dict
model.state_dict = (
    lambda self, *_, **__: get_peft_model_state_dict(self, old_state_dict())
).__get__(model, type(model))

# Compile the model (PyTorch 2.x; not supported on Windows)
if torch.__version__ >= "2" and sys.platform != "win32":
    model = torch.compile(model)
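
If a training run already finished with these optimizations in place, it may be possible to re-save a usable adapter from the model while it is still in memory, rather than retraining. A rough sketch, assuming model is the trained PeftModel and relying on torch.compile keeping the original module in the wrapper's _orig_mod attribute (the output directory name here is made up):

# Unwrap the torch.compile wrapper, if present, then save the adapter again.
unwrapped = getattr(model, "_orig_mod", model)
unwrapped.save_pretrained("fine-tuned-code-llama/checkpoint-400-resaved")

save_pretrained on a PeftModel writes only the adapter weights, so the re-saved checkpoint should load with PeftModel.from_pretrained exactly as in the question.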