In the spectral-norm implementation, “u” and “v” are treated as constants while doing backprop of the loss with respect to the weights W.
Since the spectral norm is

sigma(W) = u^T W v

and u and v are themselves functions of W (they come from power iteration on W), let's say I want the derivatives of sigma(W) with respect to W to backprop through u and v as well. If I remove the “with torch.no_grad()” around the power iteration, will it compute the gradient of the weights taking into account that “u” and “v” are also functions of W?
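To make the two behaviors concrete, here is a minimal sketch (not PyTorch's actual `torch.nn.utils.spectral_norm`, which stores u as a buffer and updates it in `forward`; the function name and the `track_uv_grad` flag are hypothetical). With the power iteration under `torch.no_grad()`, u and v carry no graph history, so the gradient of sigma = u^T W v with respect to W is just the rank-1 outer product u v^T. If the iteration instead runs under grad mode, u and v stay attached to the graph and autograd also differentiates through every iteration step:

```python
import torch

def spectral_sigma(W, n_iters=10, track_uv_grad=False):
    """Estimate the top singular value sigma(W) = u^T W v by power iteration.

    track_uv_grad=False mimics the usual spectral-norm setup: the iteration
    runs under torch.no_grad(), so u and v are constants to autograd and
    d(sigma)/dW reduces to the outer product u v^T.
    track_uv_grad=True keeps the iteration in the graph, so gradients also
    flow through u's and v's dependence on W.
    """
    u = torch.randn(W.shape[0])
    v = torch.randn(W.shape[1])
    ctx = torch.enable_grad() if track_uv_grad else torch.no_grad()
    with ctx:
        for _ in range(n_iters):
            v = torch.nn.functional.normalize(W.t() @ u, dim=0)
            u = torch.nn.functional.normalize(W @ v, dim=0)
    # Under no_grad the tensors already have no history; detach is a no-op
    # there but makes the "u, v are constants" case explicit.
    if not track_uv_grad:
        u, v = u.detach(), v.detach()
    return u @ (W @ v)
```

Note that since u and v are normalized, u^T W v can never exceed the true largest singular value, and when u and v are detached the resulting weight gradient is exactly the rank-1 matrix u v^T.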