In Seq2SeqTrainingArguments, what happens when I set adafactor=True and also specify a learning rate?

Say that I pass the following arguments to Seq2SeqTrainingArguments:

    adafactor = True,
    optim = "adafactor",
    learning_rate = 1e-4

In this case, I am not sure whether learning_rate is actually used anywhere. The Seq2SeqTrainingArguments documentation says:

  • learning_rate (float, optional, defaults to 5e-5) — The initial learning rate for AdamW optimizer.

Does this mean that learning_rate is completely ignored when Adafactor is used?

Thank you!