RuntimeError When Saving Phi 3.5 Vision Due to Shared Tensors

I’m trying to fine-tune Phi 3.5 Vision using transformers. However, I’m running into an issue when trying to save the model during or after training. See below for a minimal reproducible example.

Does anyone have any pointers? This issue has been reported in a few other locations - see below.

  1. Saving Phi 3 vision fails due to tensor sharing · Issue #32354 · huggingface/transformers · GitHub
  2. Using Trainer to save a Bartforsequenceclassification model

The error suggests saving with `safe_serialization=False`, but I’m not sure what the implications of that are.

Minimal Reproducible Example

from transformers import AutoModelForCausalLM
model_id = "microsoft/Phi-3.5-vision-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto"
)
model.save_pretrained("out", safe_serialization=True)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/AWSBedrockScienceModelDistillationTraining/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 2958, in save_pretrained
    raise RuntimeError(
RuntimeError: The weights trying to be saved contained shared tensors [{'model.embed_tokens.weight', 'model.vision_embed_tokens.wte.weight'}] that are mismatching the transformers base configuration. Try saving using `safe_serialization=False` or remove this tensor sharing.
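For context on the suggested fallback: `safe_serialization=False` makes `save_pretrained` fall back to the pickle-based `torch.save` format (`pytorch_model.bin`) instead of `.safetensors`. Pickle records a shared storage once and restores the sharing on load, which is exactly what safetensors refuses to do; the trade-off is that pickle checkpoints can execute arbitrary code when loaded, so they should only come from trusted sources. A minimal sketch of the sharing behavior, independent of transformers:

```python
import os
import tempfile

import torch

# Two dict entries pointing at the same tensor, mimicking the tied
# embed_tokens / vision_embed_tokens.wte weights named in the error message.
w = torch.zeros(4, 4)
state = {"a": w, "b": w}

# torch.save (pickle-based) records the shared storage once and restores
# the sharing on load - this is why the pytorch_model.bin format can
# handle what safetensors rejects.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "shared.pt")
    torch.save(state, path)
    loaded = torch.load(path)

# Both entries still point at one storage after the round trip.
assert loaded["a"].data_ptr() == loaded["b"].data_ptr()
```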

It seems that the problem is with the model itself: the checkpoint ties `model.embed_tokens.weight` to `model.vision_embed_tokens.wte.weight`, and safetensors refuses to serialize shared tensors that don’t match the tying transformers expects from the base configuration.
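The error message’s other suggestion ("remove this tensor sharing") can be done by giving one of the tied parameters its own copy of the data before saving. A minimal sketch of the untying technique on a toy module (the real attribute paths on the Phi model, e.g. `model.model.vision_embed_tokens.wte`, are inferred from the error message and should be verified on the loaded model):

```python
import torch
import torch.nn as nn

# Toy stand-in for the real model: two embedding modules sharing one
# weight tensor, mirroring model.embed_tokens.weight and
# model.vision_embed_tokens.wte.weight from the error message.
class ToyShared(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed_tokens = nn.Embedding(8, 4)
        self.wte = nn.Embedding(8, 4)
        self.wte.weight = self.embed_tokens.weight  # tied: same storage

model = ToyShared()
assert model.wte.weight.data_ptr() == model.embed_tokens.weight.data_ptr()

# Untie by replacing one side with an independent copy of the same values.
model.wte.weight = nn.Parameter(model.embed_tokens.weight.detach().clone())

# The parameters now have separate storage but identical values, so
# safetensors serialization would no longer see shared tensors here.
assert model.wte.weight.data_ptr() != model.embed_tokens.weight.data_ptr()
```

Note that after untying, further training updates the two copies independently, so if the model relies on the weights staying tied you may want to re-tie them after loading instead.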