Difference between accelerate/torch_distributed/deepspeed

Hi, I am new to distributed training and am using Hugging Face to train large models. I see many options for running distributed training. Could someone explain the difference between the following options (rough example invocations below)?

  1. python train.py <ARGS>
  2. python -m torch.distributed.launch <LAUNCH_ARGS> train.py <ARGS>
  3. deepspeed train.py <ARGS>
  4. accelerate launch train.py <ARGS> (Hugging Face Accelerate)
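To make the comparison concrete, here is roughly how I am invoking each one. This is just a sketch: the script name, the arguments, the GPU count, and the `ds_config.json` file are placeholders I made up.

```bash
# 1. Plain Python: a single process; any multi-GPU behavior has to come
#    from the script itself (e.g. whatever the training code does internally)
python train.py --model gpt2 --batch_size 8

# 2. PyTorch launcher: spawns one process per GPU and sets the
#    RANK / WORLD_SIZE / LOCAL_RANK environment variables for each
python -m torch.distributed.launch --nproc_per_node=4 train.py --model gpt2 --batch_size 8

# 3. DeepSpeed launcher: also spawns one process per GPU, and the script
#    additionally reads a DeepSpeed config (placeholder file name here)
deepspeed --num_gpus=4 train.py --model gpt2 --batch_size 8 --deepspeed ds_config.json

# 4. Accelerate launcher: uses the defaults saved by `accelerate config`,
#    which can be overridden on the command line
accelerate launch --num_processes=4 train.py --model gpt2 --batch_size 8
```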

I did not expect option 1 to use distributed training, but it seems to use some sort of torch distributed training anyway. In that case, what's the difference between option 1 and option 2?
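A quick way I'm checking this: printing the environment variables that the distributed launchers set for every process (a minimal sketch; the variable names are the standard torch.distributed ones, the rest is just a throwaway one-liner):

```bash
# Under option 1 I'd expect these to all be None (single process, nothing set);
# under option 2, each spawned process should see its own RANK / LOCAL_RANK.
python -c "import os; print({k: os.environ.get(k) for k in ('RANK', 'WORLD_SIZE', 'LOCAL_RANK', 'MASTER_ADDR')})"
```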

Does DeepSpeed use torch.distributed in the background?

Also, Hugging Face seems to use torch distributed training by default. What's the difference between that and Accelerate?