How to use WeightNorm with VGG architecture?

Hi there,

I’d like to use torch.nn.utils.weight_norm on a VGG net, but it’s unclear to me from the docs how exactly to add the hooks. Right now I wrap each conv layer like this:

import torch.nn as nn

def make_layers(cfg, batch_norm=False, centering=False, normalize_std=False,
                fixed_std=False, weightnorm=False):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == 'M':
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            # wrap the conv in weight_norm only when the flag is set
            if weightnorm:
                conv2d = nn.utils.weight_norm(conv2d)
            layers += [conv2d, nn.ReLU(inplace=True)]
            in_channels = v
    return nn.Sequential(*layers)
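
As a sanity check, listing the parameter names should show weight_g / weight_v in place of weight for every wrapped conv, since that is how nn.utils.weight_norm reparameterizes a layer. A quick sketch (the cfg list below is just the standard VGG-11 layout, used here only as an example):

cfg = [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M']  # example VGG-11 layout
features = make_layers(cfg, weightnorm=True)

# weight_norm replaces each wrapped conv's `weight` with `weight_g` and `weight_v`,
# so those names should show up among the parameters.
print([name for name, _ in features.named_parameters() if 'weight' in name])
# e.g. ['0.weight_g', '0.weight_v', '3.weight_g', '3.weight_v', ...]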

But when running SGD I see absolutely no difference from the vanilla net (exact same loss values over time), and in fact the weights of the final classification layers (which have no WN added) are exactly the same in both networks after any k > 1 iterations. How is that possible?
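
For context, here is a stripped-down sketch of the kind of comparison I mean (dummy batch and placeholder loss, features-only net from make_layers; my real runs train the full VGG including the classifier head):

import torch

cfg = [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M']  # same example layout as above

def run_steps(weightnorm, steps=5):
    # Same seed, so both nets start from identical weights and see the same dummy batch.
    torch.manual_seed(0)
    net = make_layers(cfg, weightnorm=weightnorm)
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    x = torch.randn(4, 3, 32, 32)    # placeholder input batch
    losses = []
    for _ in range(steps):
        opt.zero_grad()
        loss = net(x).pow(2).mean()  # placeholder loss
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses

# Compare the two loss trajectories step for step.
print(run_steps(weightnorm=False))
print(run_steps(weightnorm=True))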