Hi team,
I have noticed PyTorch's new DistributedTensor feature, which provides a new way to build large models, and the new TorchDynamo feature in PyTorch 2.0, which provides a new way to capture graphs. Are there any plans for PyTorch to capture the whole graph of a single-GPU program and automatically translate it to a distributed version with model parallelism and pipeline parallelism, based on the available GPU resources?
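
To make the question concrete, here is a rough sketch of the two pieces as I understand them today (assuming the prototype `torch.distributed._tensor` API, whose module path may still change, a 4-GPU setup, and that `torch.distributed` has already been initialized, e.g. via `torchrun`):

```python
import torch
from torch.distributed._tensor import DeviceMesh, Shard, distribute_tensor

# Today the user must describe the parallelism by hand:
mesh = DeviceMesh("cuda", list(range(4)))  # 1-D mesh over 4 GPUs
big_weight = torch.randn(8192, 8192)
# Shard the tensor along dim 0 across the mesh
sharded = distribute_tensor(big_weight, mesh, placements=[Shard(0)])

# Separately, TorchDynamo can capture the single-GPU program:
model = torch.nn.Linear(8192, 8192)
compiled = torch.compile(model)  # graph capture via Dynamo

# What I am asking about: could the captured graph be lowered to a
# DTensor-sharded (model-parallel / pipeline-parallel) program
# automatically, given the available GPU resources?
```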
Thanks in advance; any response would be appreciated.