Tensor parallelism in PyTorch 2.3

PyTorch 2.3 supports tensor parallelism. I'm wondering when Transformers might officially support this feature as well. It would be very helpful for handling very large language models.
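
For context, here is a minimal single-process sketch of the idea I have in mind. It is not the PyTorch or Transformers API: the function names (`mlp_reference`, `mlp_tensor_parallel`, `split_columns`, `split_rows`) and the tiny integer matrices are made up for illustration, and the "ranks" are just list slices with no real devices or communication. It assumes the common scheme where the first linear layer is sharded column-wise and the second row-wise, and the partial outputs are summed, which is the step an all-reduce would perform in a real setup.

```python
# Single-process illustration of tensor parallelism for a two-layer MLP.
# Hypothetical helper names; plain Python lists stand in for tensors.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Element-wise ReLU."""
    return [[max(0, v) for v in row] for row in m]

def split_columns(m, parts):
    """Split a matrix into `parts` equal column blocks."""
    width = len(m[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in m] for i in range(parts)]

def split_rows(m, parts):
    """Split a matrix into `parts` equal row blocks."""
    height = len(m) // parts
    return [m[i * height:(i + 1) * height] for i in range(parts)]

def mlp_reference(x, w1, w2):
    """Unsharded forward pass: relu(x @ w1) @ w2."""
    return matmul(relu(matmul(x, w1)), w2)

def mlp_tensor_parallel(x, w1, w2, world_size):
    """Same forward pass with the weights sharded across `world_size` simulated ranks."""
    w1_shards = split_columns(w1, world_size)  # column-wise shards of layer 1
    w2_shards = split_rows(w2, world_size)     # row-wise shards of layer 2
    partials = []
    for w1_s, w2_s in zip(w1_shards, w2_shards):
        hidden = relu(matmul(x, w1_s))         # each rank holds a slice of the hidden dim
        partials.append(matmul(hidden, w2_s))  # each rank produces a partial output
    # Sum the partial outputs element-wise (the role of an all-reduce).
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]

x = [[1, -2, 3], [0, 4, -1]]                            # batch of 2, input dim 3
w1 = [[1, 0, -1, 2], [2, 1, 0, -1], [0, -2, 1, 1]]      # 3 x 4
w2 = [[1, 2], [0, -1], [3, 1], [-2, 0]]                 # 4 x 2

print(mlp_tensor_parallel(x, w1, w2, world_size=2) == mlp_reference(x, w1, w2))
```

The element-wise activation can be applied per shard because the column-wise split keeps each hidden unit entirely on one rank, so only one reduction is needed per MLP block. In PyTorch itself, I believe the corresponding pieces are `parallelize_module` with `ColwiseParallel` and `RowwiseParallel` from `torch.distributed.tensor.parallel`, and it would be great if Transformers models could be used with them out of the box.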