Hi,
I am trying to train a Siamese network and want to use an adaptive learning rate during training.
I have a SiameseTrainer, which is a subclass of the Trainer class. Am I doing something wrong? I get the error
TypeError: Object of type Tensor is not JSON serializable
in this line of my code:
train_result = trainer.train(resume_from_checkpoint=checkpoint)
My code runs for a few epochs and then crashes. Here is what I am doing:
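import numpy as np
from transformers import EvalPrediction

def compute_metrics(p: EvalPrediction):
    # predictions may be a tuple of outputs; in that case use the second element
    correct_predictions = p.predictions[1] if isinstance(p.predictions, tuple) else p.predictions
    labels = p.label_ids
    return {
        "MSE": np.mean((correct_predictions - labels) ** 2)
    }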
from transformers.optimization import Adafactor, AdafactorSchedule

optimizer = Adafactor(
    model.parameters(),
    scale_parameter=True,  # If True, learning rate is scaled by root mean square
)
lr_scheduler = AdafactorSchedule(optimizer)
trainer = SiameseTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
    optimizers=(optimizer, lr_scheduler),
)
When I comment out the optimizers=(optimizer, lr_scheduler) argument and run the code again, training no longer crashes partway through.
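To help narrow this down, here is a minimal sketch of how the same kind of error arises in isolation. The message is the one Python's json module raises when it is asked to encode an object it does not know, so a Tensor may be ending up in something that gets written out as JSON. My guess, which I have not confirmed from the traceback, is that this is a logged value such as the learning rate. FakeTensor and to_jsonable below are hypothetical names used only for illustration; neither is part of transformers or PyTorch, and FakeTensor just mimics the .item() method of a scalar tensor so the snippet runs without any dependencies.

```python
import json


class FakeTensor:
    """Hypothetical stand-in for a scalar torch.Tensor (only .item() is modelled)."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


def to_jsonable(obj):
    """Recursively replace tensor-like scalars with plain Python values."""
    if isinstance(obj, dict):
        return {key: to_jsonable(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_jsonable(value) for value in obj]
    if callable(getattr(obj, "item", None)):
        return obj.item()
    return obj


# Assumed shape of a logged entry; the real structure may differ.
log_entry = {"learning_rate": FakeTensor(0.001), "step": 500}

try:
    json.dumps(log_entry)
    raised = False
except TypeError as err:
    # Same wording as in my traceback, with FakeTensor in place of Tensor
    raised = True
    print(err)

# After conversion the entry serializes without error
cleaned = json.dumps(to_jsonable(log_entry))
print(cleaned)
```

If that guess is right, converting the offending value to a plain float (for a real tensor, via .item()) before it is logged or saved should avoid the crash, but I do not know where in the Trainer that conversion would belong.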
Appreciate any help!