Has anyone by chance implemented L^2-SP regularization for the Adam optimizer?
I want to avoid reinventing the wheel, but I believe this would require a custom version of the AdamW
class.
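
For concreteness, here is a minimal, framework-agnostic sketch of what such an update could look like. It assumes the decoupled (AdamW-style) form, where the decay term pulls each weight toward its starting (pre-trained) value instead of toward zero. The names `adamw_l2sp_step`, `init_state`, `alpha`, and the plain-list state are hypothetical and not part of any library; a real version would subclass the framework's optimizer.

```python
import math


def init_state(n):
    """Hypothetical helper: fresh Adam state for n scalar parameters."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}


def adamw_l2sp_step(params, grads, start, state,
                    lr=1e-3, betas=(0.9, 0.999), eps=1e-8, alpha=0.01):
    """One AdamW-style step with decoupled decay toward `start` (L2-SP).

    params, grads, start: lists of floats of equal length.
    alpha: assumed strength of the pull toward the starting point.
    """
    beta1, beta2 = betas
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g, p0) in enumerate(zip(params, grads, start)):
        # Decoupled L2-SP decay: shrink the distance to p0, not to zero.
        p = p - lr * alpha * (p - p0)
        # Standard Adam moment estimates on the raw gradient.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
        new_params.append(p)
    return new_params
```

With zero gradients the weights only drift back toward `start`, which is a quick sanity check for the decay term. The coupled variant (adding `alpha * (p - p0)` to the gradient before the moment estimates) would behave differently under Adam's rescaling, so which one is wanted is part of the question.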