System Info
[What I used]
- Polyglot-12.8B (GPT-NeoX based; EleutherAI/polyglot-ko-12.8b on Hugging Face)
- transformers version: 4.32.0.dev0
- Trainer: transformers run_clm_no_trainer.py with Accelerate (https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm_no_trainer.py)
- Used DeepSpeed ZeRO-3 (an illustrative sketch of the config is shown after the code block below)
- I added ignore_mismatched_sizes=True to from_pretrained:
```python
model = AutoModelForCausalLM.from_pretrained(
    args.model_name_or_path,
    from_tf=bool(".ckpt" in args.model_name_or_path),
    config=config,
    low_cpu_mem_usage=args.low_cpu_mem_usage,
    ignore_mismatched_sizes=True,  # added
)
```
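For reference, the ZeRO-3 side of my DeepSpeed setup looks roughly like the sketch below. The values are illustrative rather than my exact file, shown here as a Python dict; the real config is the JSON that accelerate launch consumes.

```python
# Illustrative ZeRO stage-3 settings (not the exact config file).
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "contiguous_gradients": True,
        # Note: stage3_gather_16bit_weights_on_model_save defaults to false.
    },
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}
```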
[What I did]
- Fine-tuned Polyglot; this worked fine.
- When re-fine-tuning the model from step 1, the following error occurred:
```
size mismatch for gpt_neox.layers.38.mlp.dense_h_to_4h.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([20480, 5120]).
size mismatch for gpt_neox.layers.38.mlp.dense_4h_to_h.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([5120, 20480]).
size mismatch for gpt_neox.layers.39.attention.query_key_value.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([15360, 5120]).
size mismatch for gpt_neox.layers.39.attention.dense.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([5120, 5120]).
size mismatch for gpt_neox.layers.39.mlp.dense_h_to_4h.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([20480, 5120]).
size mismatch for gpt_neox.layers.39.mlp.dense_4h_to_h.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([5120, 20480]).
size mismatch for embed_out.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([30003, 5120]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
```
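The torch.Size([0]) entries suggest the saved checkpoint itself contains empty tensors. A minimal sketch for checking the saved shapes directly (the pytorch_model.bin file name is an assumption; sharded checkpoints use pytorch_model-0000X-of-0000Y.bin instead):

```python
import torch

# Load the fine-tuned checkpoint's state dict directly from disk.
# "output_dir/pytorch_model.bin" is a hypothetical path.
state_dict = torch.load("output_dir/pytorch_model.bin", map_location="cpu")

# List every tensor with zero elements -- these are the parameters that
# trigger the "copying a param with shape torch.Size([0])" errors above.
for name, tensor in state_dict.items():
    if tensor.numel() == 0:
        print(name, tuple(tensor.shape))
```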
My transformers version is the latest, and I already pass ignore_mismatched_sizes=True, but this error still occurs.
Does anyone know how to solve this?
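My current guess is that the checkpoint was saved while the parameters were still partitioned across GPUs by ZeRO-3, which would explain the torch.Size([0]) placeholders. If that is right, is consolidating the partitioned checkpoint before reloading the correct fix? A minimal sketch of what I am considering, using DeepSpeed's zero_to_fp32 utility (it assumes DeepSpeed wrote a checkpoint directory such as output_dir/ containing a global_stepXXX/ folder; the paths are hypothetical):

```python
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

# Gather the ZeRO-3 partitioned shards into one full fp32 state dict.
# "output_dir" is a hypothetical path to the DeepSpeed checkpoint folder.
state_dict = get_fp32_state_dict_from_zero_checkpoint("output_dir")

# Replace the empty placeholder weights with the consolidated tensors so
# that from_pretrained sees full-sized parameters.
torch.save(state_dict, "output_dir/pytorch_model.bin")
```

Or would setting stage3_gather_16bit_weights_on_model_save to true in the DeepSpeed config achieve the same thing at save time?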