I wrapped my model with DDP. In the past, however, I had issues where, because of the model architecture, DDP was not hooking into all the tensors and therefore was not sharing the gradients as it was supposed to.
Now with a different model and architecture, I am concerned and want to know for sure that DDP is working correctly. Is there a simple way to test this?
I also created a torchviz graph diagram of my model, but I am not sure what I should be looking for. There do seem to be some scatters thrown in there, but I do not know whether I can tell from the diagram alone if everything is working.
Torchviz diagram: https://i.imgur.com/htq9TCb.png
So, to rephrase: how do I test that DDP is sharing the gradients as it should?
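For reference, here is a minimal sketch of the kind of check I have in mind. It assumes the process group is already initialized and that each rank is fed different data, since otherwise the gradients would match trivially. After `loss.backward()`, DDP should have averaged the gradients, so every rank should hold the same `.grad` for every trainable parameter. The helper name `check_ddp_grads` and its `atol` argument are made up for illustration and are not part of PyTorch.

```python
import torch
import torch.distributed as dist


def check_ddp_grads(model, atol=0.0):
    """Compare every trainable parameter's .grad across all ranks.

    Hypothetical helper (not a PyTorch API). Must be called on every rank,
    after backward() and before optimizer.step() / zero_grad().
    Returns {"missing": [...], "mismatched": [...]} with parameter names.
    """
    world_size = dist.get_world_size()
    report = {"missing": [], "mismatched": []}
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if param.grad is None:
            # No gradient reached this parameter on this rank.
            report["missing"].append(name)
            # Still take part in the collective so the ranks stay in step.
            local = torch.zeros_like(param)
        else:
            local = param.grad.detach().clone()
        # Collect this parameter's gradient from every rank.
        gathered = [torch.empty_like(local) for _ in range(world_size)]
        dist.all_gather(gathered, local)
        # After DDP's all-reduce, all copies should agree with rank 0's copy.
        if any(
            not torch.allclose(gathered[0], g, rtol=0.0, atol=atol)
            for g in gathered[1:]
        ):
            report["mismatched"].append(name)
    return report
```

The idea would be to call it once on the DDP-wrapped model right after the backward pass of a single step. An empty report would suggest that gradients are being synchronized for every parameter, while names under `"missing"` or `"mismatched"` would point at the tensors DDP is not reaching. A small nonzero `atol` may be needed if the backend does not produce bitwise-identical results on every rank.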