Parallelizing Hugging Face models

Hi,

I have 4 GPUs with 24 GB of memory each, and I would like to parallelize the GPT-Neo models (2.7B and 1.3B) across them and train them on my own text data. I would also like to parallelize and train 7-billion-parameter models like LLaMA. I am unsure how to do this with DeepSpeed, since there is a lot of information out there but not many straightforward implementations. Please let me know how this can be done.
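For reference, here is the kind of setup I have pieced together from the DeepSpeed docs: a minimal sketch assuming ZeRO stage 3 with CPU offload through the Hugging Face Trainer. The data file, output directory, and hyperparameters are placeholders, not a tested recipe. Is this the right direction?

```python
# Minimal sketch: fine-tune GPT-Neo with DeepSpeed ZeRO stage 3 via the
# Hugging Face Trainer. "my_data.txt" and the hyperparameters below are
# placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "EleutherAI/gpt-neo-2.7B"  # or "EleutherAI/gpt-neo-1.3B"

# ZeRO stage 3 shards parameters, gradients, and optimizer state across
# the 4 GPUs; CPU offload trades speed for memory headroom. The "auto"
# values are filled in from TrainingArguments by the Trainer integration.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu"},
        "offload_param": {"device": "cpu"},
    },
    "bf16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # GPT-Neo has no pad token by default
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

dataset = load_dataset("text", data_files={"train": "my_data.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_set = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="gpt-neo-finetuned",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,            # assumes Ampere-or-newer GPUs; use fp16 otherwise
    deepspeed=ds_config,  # Trainer accepts a dict or a path to a JSON file
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_set,
    # mlm=False gives standard causal-LM labels (inputs shifted by one)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

I launch it with `deepspeed --num_gpus=4 train.py`. Would the same ZeRO-3 config be enough for a 7B model like LLaMA on 4x24 GB, or would I also need something like gradient checkpointing (`model.gradient_checkpointing_enable()`) to fit the activations?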

Thanks