Hi,
I have a question about the create_optimizer function.
num_epochs = 5
num_train_steps = len(tf_train_dataset) * num_epochs
optimizer, schedule = create_optimizer(
    init_lr=3e-5,
    num_warmup_steps=int(num_train_steps * 0.06),
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
)
model_0.compile(optimizer=optimizer)
If I compile the model like that, will the learning rate actually decay over time? Or do I need to pass the schedule explicitly, like this?
model_0.compile(optimizer=optimizer(learning_rate=schedule))
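For context, my understanding is that the schedule is linear warmup followed by linear decay to zero over num_train_steps. Here is a minimal pure-Python sketch of that shape (the function name and exact decay formula are my assumptions, not the library's actual implementation):

```python
# Sketch of a linear warmup + linear decay learning rate schedule,
# similar in shape to what I believe create_optimizer sets up.
# This is an illustrative assumption, not transformers' real code.
def linear_warmup_decay(step, init_lr=3e-5, num_warmup_steps=60,
                        num_train_steps=1000):
    """Learning rate at a given optimizer step."""
    if step < num_warmup_steps:
        # Warmup phase: ramp linearly from 0 up to init_lr.
        return init_lr * step / num_warmup_steps
    # Decay phase: ramp linearly from init_lr down to 0 at num_train_steps.
    remaining = max(0, num_train_steps - step)
    return init_lr * remaining / (num_train_steps - num_warmup_steps)

# Learning rate rises during warmup, then falls toward zero afterwards.
lrs = [linear_warmup_decay(s) for s in (0, 30, 60, 500, 1000)]
```

So my question is really whether the optimizer returned by create_optimizer already has a schedule like this baked in.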