Tensor parallelism has been supported in PyTorch since version 2.3. I am wondering when Transformers will also officially support this feature — it would be very helpful for handling very large language models.