My code is written in PyTorch, so I use torch.optim.Adam as my optimizer. However, I need Adam with weight decay where some layers are excluded. More specifically, I am trying to reproduce this TensorFlow code:
optimizer = AdamWeightDecayOptimizer(*some parameter setting here,* exclude_from_weight_decay=["LayerNorm", "layer_norm", "bias"])
What should I do to exclude ["LayerNorm", "layer_norm", "bias"] from weight decay in PyTorch? Can I use a TensorFlow optimizer in PyTorch?
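From what I have read, per-group weight_decay with torch.optim.AdamW (which implements decoupled weight decay, like the TensorFlow AdamWeightDecayOptimizer) might be the way to do this. Here is a minimal sketch of what I am considering; the toy model and its attribute names are my own assumption, only the exclusion list comes from the original code:

```python
import torch
import torch.nn as nn

# Hypothetical toy model, just so the parameter names contain
# "LayerNorm" and "bias" like in a real transformer.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 10)
        self.LayerNorm = nn.LayerNorm(10)

    def forward(self, x):
        return self.LayerNorm(self.linear(x))

model = Net()

# Same exclusion list as in the TensorFlow code.
no_decay = ["LayerNorm", "layer_norm", "bias"]

# Split parameters by substring match on their names.
decay_params = [p for n, p in model.named_parameters()
                if not any(nd in n for nd in no_decay)]
no_decay_params = [p for n, p in model.named_parameters()
                   if any(nd in n for nd in no_decay)]

# One parameter group with weight decay, one without.
optimizer = torch.optim.AdamW(
    [{"params": decay_params, "weight_decay": 0.01},
     {"params": no_decay_params, "weight_decay": 0.0}],
    lr=1e-3,
)
```

Would this be equivalent to the exclude_from_weight_decay behavior, or is there something I am missing?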
Thank you.