I'm facing something similar: I pass optim="adafactor", and whether I set learning_rate explicitly or leave it at the default, every training phase reports a learning rate of 0.0, so my model never updates.