When I use the Trainer API to train a GLM model and save it, I find that the finetuned model on disk is twice the size of the original model. What is the reason for this?

This is the original model; it occupies 678 MB. However, the finetuned model occupies 1.33 GB.

TrainingArguments Setting

Prepare the trainer and start training

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=2,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    eval_steps=10,
    save_steps=10,
    warmup_steps=100,
    learning_rate=1e-5,
    fp16=True,
    do_train=True,
    do_eval=True,
    save_strategy='steps',
    save_total_limit=3,
    evaluation_strategy='steps',
    load_best_model_at_end=True,
    logging_steps=50,
    logging_dir='./logs',
)

trainer = MyTrainer(
    model,
    training_args,
    train_dataset=train_dataloader,
    eval_dataset=dev_dataloader,
    data_collator=collate_fn,
    tokenizer=tokenizer,
)  # once the tokenizer is defined, the data_collator here ...

Thanks for answering.

finetuned model occupies 1.33GB

That’s probably because the original model was saved in float16 or bfloat16. But you can’t train in those precisions (fp16=True uses mixed-precision training, which keeps a float32 master copy of the weights), so the model saved by the Trainer is in float32, which takes twice the space on disk.
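
If that’s what happened, you can cast the weights back to half precision before saving to recover the original footprint. A minimal sketch, assuming this cause; the checkpoint path is a placeholder, and trust_remote_code=True is assumed to be needed for GLM’s custom modeling code:

from transformers import AutoModel

# Load the float32 checkpoint written by the Trainer (placeholder path).
model = AutoModel.from_pretrained("output_dir/checkpoint-xxx", trust_remote_code=True)
print(model.dtype)  # torch.float32 -> roughly twice the file size of float16

# Cast back to float16 and re-save; the file should shrink to about the original size.
model.half()
model.save_pretrained("output_dir/checkpoint-fp16")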

Thanks for your response. But I followed your advice, set fp16=False, and found it doesn’t work. In addition, the original model’s data type is float32.
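This is roughly how I checked it (the path is a placeholder for my local copy of the original weights; trust_remote_code=True is assumed for GLM’s custom code):

from transformers import AutoModel

# Placeholder path to the original GLM weights.
model = AutoModel.from_pretrained("path/to/original-glm", trust_remote_code=True)
print(model.dtype)  # prints torch.float32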

Are there other reasons?

The finetuned model’s data type is also float32.
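Checked the same way (placeholder path to the checkpoint saved by the Trainer):

from transformers import AutoModel

model = AutoModel.from_pretrained("output_dir/checkpoint-xxx", trust_remote_code=True)
print(model.dtype)  # prints torch.float32 here as well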