Is there a simple way to compute the spectrally normalized weights Wsn? One naive approach is to fetch the parameters out of a module and process them separately. Another is to wrap the module in a forward hook, though I doubt that works correctly during backpropagation. Can anyone help?
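For reference, here is a minimal sketch of the "fetch the parameters and process them" route. It assumes Wsn means W divided by its largest singular value, and it estimates that value by power iteration. The function name `spectral_norm_estimate` and the iteration count are illustrative choices, not part of any library. PyTorch also ships `torch.nn.utils.spectral_norm`, which may already cover this use case.

```python
import torch
import torch.nn as nn

def spectral_norm_estimate(weight, n_iters=100, eps=1e-12):
    """Estimate the largest singular value of a weight tensor by power iteration."""
    w = weight.reshape(weight.shape[0], -1)  # flatten conv kernels to a 2-D matrix
    u = torch.randn(w.shape[0], device=w.device, dtype=w.dtype)
    u = u / (u.norm() + eps)
    # The singular vectors are treated as constants, so no graph is built here.
    with torch.no_grad():
        for _ in range(n_iters):
            v = w.t() @ u
            v = v / (v.norm() + eps)
            u = w @ v
            u = u / (u.norm() + eps)
    # sigma ~= u^T W v; gradients flow through w only.
    return torch.dot(u, w @ v)

torch.manual_seed(0)
layer = nn.Linear(8, 4)
sigma = spectral_norm_estimate(layer.weight)
exact = torch.linalg.matrix_norm(layer.weight, ord=2)  # exact value, for comparison
w_sn = layer.weight / sigma  # assumed definition of Wsn
print(sigma.item(), exact.item())
```

Because `sigma` is built from `layer.weight` outside the `no_grad` block, `w_sn` stays connected to the autograd graph and can be used in a forward pass.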
Thank you very much for your insights. However, when I wrap this in SGD, I run into a problem. During forward propagation the model uses W rather than Wsn to compute the output, which is wrong because the algorithm's pseudocode computes the output with Wsn. Any ideas?
I see. So the forward pass is supposed to use the normalized weights? You can still implement this as an optimizer. Store the unnormalized weights in the optimizer's state dict, and at each step set the model parameters to the normalized weights.
A small question: are you adding the spectral norm of the weights as an extra regularizer, or are you embedding it in the SGD algorithm itself?
Actually, I am trying to see the effect of the spectral norm of the weights as a regularizer (like an L2 loss). Would your approach be useful in my case? My concern is that the function computing the spectral norm wouldn't be differentiable, and whatever I add to the total loss has to be differentiable.
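On the differentiability concern, one possible sketch: the largest singular value is differentiable wherever it is not repeated, and `torch.linalg.matrix_norm(..., ord=2)` supports autograd, so it can be added to the loss directly. The helper name `spectral_penalty`, the toy model, and the strength `lam` are assumptions made up for illustration.

```python
import torch
import torch.nn as nn

def spectral_penalty(model):
    """Sum of the largest singular values of all matrix-shaped parameters."""
    total = 0.0
    for p in model.parameters():
        if p.dim() >= 2:
            w = p.reshape(p.shape[0], -1)  # view conv kernels as 2-D matrices
            total = total + torch.linalg.matrix_norm(w, ord=2)
    return total

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
x, y = torch.randn(32, 8), torch.randn(32, 4)
lam = 0.01  # hypothetical regularization strength

# Task loss plus the spectral-norm penalty, analogous to adding an L2 term.
loss = nn.functional.mse_loss(model(x), y) + lam * spectral_penalty(model)
loss.backward()
```

For large layers the full singular value computation can be expensive. A power-iteration estimate is a common cheaper alternative.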