I think the docs are insufficient. See my questions here: Using Transformers with DistributedDataParallel — any examples?