Split a single model across multiple GPUs

This ensures that the outputs are gathered on GPU 0, which computes the loss and scatters it back to the replicas.
The general method is beautifully explained in this blog post.
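
For illustration, here is a minimal sketch of that flow. It assumes the approach being discussed is wrapping the model in `nn.DataParallel`, which the post itself does not spell out. The toy `nn.Linear` model, the tensor sizes, and the variable names are placeholders rather than anything from the original thread. On a machine without GPUs, `nn.DataParallel` just calls the wrapped module, so the snippet still runs.

```python
import torch
import torch.nn as nn

# Toy model; the layer sizes are arbitrary placeholders.
model = nn.Linear(10, 2)

# Use GPU 0 as the primary device when CUDA is available, else fall back to CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# DataParallel replicates the module on every visible GPU, splits each batch
# along dim 0, and gathers the outputs on the output device (the first GPU by
# default). With no GPU available it simply calls the wrapped module.
parallel_model = nn.DataParallel(model)

inputs = torch.randn(8, 10, device=device)
targets = torch.randint(0, 2, (8,), device=device)

outputs = parallel_model(inputs)  # outputs end up on the primary device
loss = nn.functional.cross_entropy(outputs, targets)  # loss is computed there
loss.backward()  # gradients are accumulated on the original module's parameters
```

Because the loss is computed on the gathered outputs, the primary GPU usually uses somewhat more memory than the others with this setup.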
