When I use the Trainer API to train a GLM model and save it, I find that the finetuned model on disk is twice the size of the original model. What is the reason for this?

This is the original model; it occupies 678 MB. However, the finetuned model occupies 1.33 GB.

TrainingArguments Setting

Prepare the trainer and start training

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=2,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    eval_steps=10,
    save_steps=10,
    warmup_steps=100,
    learning_rate=1e-5,
    fp16=True,
    do_train=True,
    do_eval=True,
    save_strategy='steps',
    save_total_limit=3,
    evaluation_strategy='steps',
    load_best_model_at_end=True,
    logging_steps=50,
    logging_dir='./logs',
)

trainer = MyTrainer(
    model,
    training_args,
    train_dataset=train_dataloader,
    eval_dataset=dev_dataloader,
    data_collator=collate_fn,
    tokenizer=tokenizer,
)  # once the tokenizer is defined, the data_collator here ...

Thanks for answering.

finetuned model occupies 1.33GB

That’s probably because the original model was saved in float16 or bfloat16. But you can’t train in those precisions (fp16=True uses mixed-precision training, which keeps a float32 master copy of the weights), so the model saved by the Trainer is in float32, which takes twice the space on disk.
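
If that’s what happened, you can cast the weights back to half precision before saving to recover the original footprint. A minimal sketch, assuming this cause; the checkpoint path is a placeholder, and trust_remote_code=True is assumed to be needed for GLM’s custom modeling code:

from transformers import AutoModel

# Load the float32 checkpoint written by the Trainer (placeholder path).
model = AutoModel.from_pretrained("output_dir/checkpoint-xxx", trust_remote_code=True)
print(model.dtype)  # torch.float32 -> roughly twice the file size of float16

# Cast back to float16 and re-save; the file should shrink to about the original size.
model.half()
model.save_pretrained("output_dir/checkpoint-fp16")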

Thanks for your response. But I followed your advice, set fp16=False, and found it doesn’t work. In addition, the original model’s data type is float32.
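This is roughly how I checked it (the path is a placeholder for my local copy of the original weights; trust_remote_code=True is assumed for GLM’s custom code):

from transformers import AutoModel

# Placeholder path to the original GLM weights.
model = AutoModel.from_pretrained("path/to/original-glm", trust_remote_code=True)
print(model.dtype)  # prints torch.float32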

Are there other reasons?

The finetuned model’s data type is also float32.
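Checked the same way (placeholder path to the checkpoint saved by the Trainer):

from transformers import AutoModel

model = AutoModel.from_pretrained("output_dir/checkpoint-xxx", trust_remote_code=True)
print(model.dtype)  # prints torch.float32 here as well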