TensorBoard support when using an optimizer with 2 separate learning rates

Hi all, I want to fine-tune a model, Wav2Vec2ForCTC, using 2 different learning rates for two parts of the model.

After extracting the 2 different parameter groups, I am defining my own optimizer as

    optim = torch.optim.AdamW([
        {'params': model2.wav2vec2.feature_extractor.parameters(), 'lr': 2e-5},
        {'params': base_params}
    ], lr = 6e-4, weight_decay = 0)

and my own learning rate scheduler (which is simply the default one of the trainer) as

    lr_scheduler = get_scheduler(
        name = "linear",  # the Trainer's default schedule type
        optimizer = optim,
        num_warmup_steps = warmup_steps,
        num_training_steps = total_training_steps,
    )

I then pass both to the trainer as trainer = Trainer(..., optimizers = (optim, lr_scheduler)).

I assume this learning rate scheduler affects both groups, doesn't it? That is, both learning rates would warm up from 0 to their respective maximum values (as set in the optimizer) and then decay linearly.
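To make that expectation concrete, here is a plain-Python sketch of the arithmetic. It assumes the linear schedule multiplies every group's initial learning rate by one shared factor, which is how PyTorch's LambdaLR-style schedulers behave. The step counts and the group names are made up for illustration; only the two learning rates come from the optimizer above.

```python
def linear_factor(step, warmup_steps, total_steps):
    """Shared multiplier applied to each group's initial LR at a given step."""
    if step < warmup_steps:
        # Linear warmup from 0 to 1.
        return step / max(1, warmup_steps)
    # Linear decay from 1 down to 0 at the last step.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Hypothetical step counts, chosen only for illustration.
warmup_steps, total_steps = 100, 1000

# Initial (maximum) learning rates of the two parameter groups.
base_lrs = {"feature_extractor": 2e-5, "base_params": 6e-4}


def lrs_at(step):
    """Learning rate of each group at a given step."""
    factor = linear_factor(step, warmup_steps, total_steps)
    return {name: lr * factor for name, lr in base_lrs.items()}


for step in (0, 50, 100, 550, 1000):
    print(step, lrs_at(step))
```

Under this assumption both groups follow the same warmup and decay shape, each scaled by its own maximum, so the ratio between them stays fixed while the factor is non-zero.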

It seems to be working so far, but when I check the TensorBoard plots, the learning_rate plot shows only one learning rate. Shouldn't it show the evolution of both groups' learning rates? Am I missing something? Also, the trainer_state.json file saved in the model's checkpoints again shows only one LR. Is this a sign that my two-learning-rate approach is not working? Thanks in advance.
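One way to check this independently of the logged plot is to read the learning rates directly from the optimizer and scheduler. The sketch below is only an approximation of the setup above: it uses a tiny stand-in model instead of Wav2Vec2ForCTC, hypothetical step counts, and a hand-written linear warmup/decay lambda with torch.optim.lr_scheduler.LambdaLR in place of get_scheduler.

```python
import torch

# Tiny stand-in model with two parts (not Wav2Vec2ForCTC).
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 2))

optim = torch.optim.AdamW(
    [
        {"params": model[0].parameters(), "lr": 2e-5},
        {"params": model[1].parameters()},  # falls back to the default lr below
    ],
    lr=6e-4,
    weight_decay=0,
)

# Hypothetical step counts, chosen only for illustration.
warmup_steps, total_steps = 2, 10


def linear_factor(step):
    """Linear warmup to 1, then linear decay to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


scheduler = torch.optim.lr_scheduler.LambdaLR(optim, linear_factor)

history = []
for _ in range(total_steps):
    optim.step()      # no gradients were computed, so parameters are unchanged
    scheduler.step()
    # get_last_lr() returns a list with one entry per parameter group.
    history.append(scheduler.get_last_lr())

for step, lrs in enumerate(history, start=1):
    print(step, lrs)

# The same values are also stored on the optimizer itself.
print([group["lr"] for group in optim.param_groups])
```

If both entries change over time, the two groups are being scheduled separately, even when a logger records only a single value.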