Hi. I am practicing with this awesome library and have been training some object detection models on custom datasets.
I save a checkpoint each epoch (the best model plus the last one). However, saving the model takes so much time, even with the save_only_model flag set to True, that it makes training impractical even for very small datasets.
Is there a way to save faster, or to save smaller, stripped-down models that can be loaded later?
Thank you in advance for your time. The trainer code and its output are below.
Best regards to all.
training_args = TrainingArguments(
    output_dir="weights",
    num_train_epochs=20,
    max_grad_norm=0.01,
    learning_rate=5e-5,
    warmup_steps=300,
    per_device_train_batch_size=2,
    # gradient_accumulation_steps=4,
    dataloader_num_workers=0,
    metric_for_best_model="loss",  # or "eval_map"
    greater_is_better=False,       # True if metric_for_best_model="eval_map"
    load_best_model_at_end=True,
    eval_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=2,
    save_only_model=True,
    remove_unused_columns=False,
    eval_do_concat_batches=False,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=pytorch_dataset_train,
    eval_dataset=pytorch_dataset_valid,
    tokenizer=processor,
    data_collator=collate_fn,
    compute_metrics=eval_compute_metrics_fn,
)
trainer.train()
Output:
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
25%|██▌ | 500/1972 [02:58<07:25, 3.30it/s]{'loss': 129.9299, 'grad_norm': 232.07464599609375, 'learning_rate': 4.401913875598087e-05, 'epoch': 0.51}
50%|█████ | 986/1972 [05:38<04:52, 3.37it/s]
  0%|          | 0/28 [00:00<?, ?it/s]
...
100%|██████████| 28/28 [00:10<00:00, 3.53it/s]