Hello,
I'm trying to launch a multi-node/multi-GPU training run for a large model with `accelerate`, using the DeepSpeed plugin (no DS config file), 8-bit Adam from bitsandbytes, and a cosine-annealing LR scheduler. However, DeepSpeed doesn't seem to use the 8-bit Adam optimizer I create in my Python script; it falls back to regular AdamW instead, even though the documentation seems to indicate that custom optimizers/schedulers should work in this setup. Any idea what's happening here? Is there some specific setup required for this?
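For context, here is roughly what my script does (a simplified sketch; the model and hyperparameter values are placeholders, not my actual ones):

```python
import bitsandbytes as bnb
from accelerate import Accelerator
from transformers import get_cosine_schedule_with_warmup

# The DeepSpeed plugin settings come from `accelerate config` /
# the launcher, not from a DS JSON config file.
accelerator = Accelerator()

model = ...  # placeholder for the large model

# 8-bit AdamW from bitsandbytes, created in the script
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-4)

# Cosine annealing with warmup (placeholder step counts)
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=500, num_training_steps=10_000
)

# Expectation: accelerate/DeepSpeed should use this custom
# optimizer/scheduler pair, but AdamW appears to be used instead.
model, optimizer, scheduler = accelerator.prepare(model, optimizer, scheduler)
```

The run is launched with `accelerate launch` on each node.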
Thanks!