Hello,
I’m trying to use DeepSpeed with Transformers, and I see there are two DeepSpeed integrations documented on HF:
(a) Transformers’ DeepSpeed integration: DeepSpeed Integration
(b) Accelerate’s DeepSpeed integration: DeepSpeed
However, I’m a bit confused by these two.
They have separate documentation pages, but are they really two completely separate integrations?
After examining the code, I've noticed that Accelerator._prepare_deepspeed
calls deepspeed.initialize,
while there is no call to deepspeed.initialize
on the Transformers side. This seems to contradict documentation (a), since the "Trainer DeepSpeed Integration" shouldn't require the user to call deepspeed.initialize
manually.
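
For context, this is roughly how I'm using the Trainer with DeepSpeed. It's a minimal sketch: the model name, the dummy dataset, and the ds_config.json path are just placeholders standing in for my real setup, and I launch it with the deepspeed launcher. The point is that I never call deepspeed.initialize myself:

```python
# Minimal sketch (placeholder model/data/config) -- note that I never
# call deepspeed.initialize() anywhere in my own code.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "bert-base-uncased"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tiny dummy dataset just to keep the sketch self-contained.
raw = Dataset.from_dict({"text": ["hello world", "goodbye world"], "label": [0, 1]})
train_dataset = raw.map(
    lambda batch: tokenizer(
        batch["text"], truncation=True, padding="max_length", max_length=32
    ),
    batched=True,
)

training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    deepspeed="ds_config.json",  # placeholder path to my DeepSpeed config
)

trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()  # DeepSpeed is supposed to be set up internally here
```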
Could someone clarify this for me, please?
Thank you!