Does anyone know how to add DistributedDataParallel() to the transformer model?
Transformer model: https://pytorch.org/tutorials/beginner/transformer_tutorial.html