I went through this tutorial here:
https://pytorch.org/tutorials/intermediate/ddp_tutorial.html
My setup is a single machine with multiple GPUs.
I am wondering: does the optimizer need to be changed in any way to account for the multiple processes?
Also, should optimizer.step() be called only by the process at rank 0?
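For context, my training step follows the tutorial's pattern, roughly like the sketch below (simplified to run as a single process on CPU with the gloo backend; the model, loss, and hyperparameters are placeholders, not my actual code). I've marked the two spots I'm unsure about with comments:

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def train_step(rank: int, world_size: int) -> float:
    # Each process joins the default process group (gloo works on CPU).
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = nn.Linear(10, 1)
    # DDP wraps the model; gradients are all-reduced during backward().
    ddp_model = DDP(model)

    # Plain per-process optimizer over the wrapped model's parameters --
    # does anything here need to change for multiple processes?
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)

    optimizer.zero_grad()
    loss = ddp_model(torch.randn(8, 10)).sum()
    loss.backward()
    optimizer.step()  # currently called on every rank -- or only on rank 0?

    dist.destroy_process_group()
    return loss.item()

# One-process smoke run of the sketch (world_size=1).
loss_value = train_step(rank=0, world_size=1)
```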