I have an application that uses a ResNet to implement MCTS, as in DeepMind's paper “Mastering the Game of Go without Human Knowledge”.
In that paper, they took the L2 norm of the weights of the residual tower blocks. In PyTorch, the weights are stored per module, and their shapes generally differ from one module to the next. I don't see an easy way to compute the L2 norm without broadcasting the weights of the smaller modules to the shape of the larger ones.
Before implementing this myself, I wanted to check whether there is a built-in method for computing the norm of the weights. I would appreciate any pointers.
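
For reference, here is a minimal sketch of the manual fallback I have in mind, assuming the goal is the L2 norm of all parameters treated as one flat vector. The helper name `l2_norm_of_weights` and the small `tower` model are hypothetical stand-ins, not my actual network. Since squared norms add across tensors, this version sums per-parameter squared norms instead of reshaping anything:

```python
import torch
import torch.nn as nn


def l2_norm_of_weights(module: nn.Module) -> torch.Tensor:
    """L2 norm of all parameters of `module`, viewed as one flat vector."""
    # Squared L2 norms add across tensors, so the shapes never need to match.
    squared = [p.pow(2).sum() for p in module.parameters()]
    return torch.stack(squared).sum().sqrt()


# Hypothetical stand-in for a residual tower: layers with different shapes.
tower = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.Conv2d(8, 4, kernel_size=3, padding=1),
)

norm = l2_norm_of_weights(tower)

# Cross-check: flatten every parameter into a single vector and take its norm.
flat = torch.cat([p.reshape(-1) for p in tower.parameters()])
reference = flat.norm(p=2)
```

Note that this loops over every parameter, including biases and batch-norm parameters; restricting it to the convolution weights would need a filter on `named_parameters()`. The result stays attached to the autograd graph, so it could be added to a loss term.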