Hi,
I’m getting very different results switching from the old spectral_norm to the new parametrization.
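For context, here is a minimal sketch of how I'm swapping between the two variants — using a stand-in `Conv2d` instead of the full UNet (the layer sizes and input are placeholders):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two identical conv layers sharing the same raw weights
conv_old = nn.Conv2d(3, 1, kernel_size=3, padding=1)
conv_new = nn.Conv2d(3, 1, kernel_size=3, padding=1)
conv_new.load_state_dict(conv_old.state_dict())

# Old (deprecated) API vs. new parametrization API
conv_old = nn.utils.spectral_norm(conv_old)
conv_new = nn.utils.parametrizations.spectral_norm(conv_new)

x = torch.randn(1, 3, 8, 8)
with torch.no_grad():
    y_old = conv_old(x)
    y_new = conv_new(x)

# Both variants run one power-iteration step per forward by default, and the
# u/v vectors are initialized randomly, so some early difference is expected.
print((y_old - y_new).abs().max().item())
```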
Here are the output features of the UNet using the old `torch.nn.utils.spectral_norm`:
tensor([[[[-0.0099, -0.0102, -0.0013, ..., -0.0101, -0.0100, -0.0076],
[-0.0214, -0.0407, -0.0349, ..., -0.0185, -0.0293, -0.0054],
[-0.0211, -0.0214, -0.0356, ..., -0.0229, -0.0362, -0.0165],
...,
[-0.0133, -0.0229, -0.0189, ..., -0.0240, -0.0370, -0.0003],
[-0.0028, -0.0027, -0.0099, ..., -0.0031, -0.0209, -0.0099],
[-0.0068, -0.0027, -0.0033, ..., -0.0076, 0.0003, 0.0011]]],
And here are the output features using the new `torch.nn.utils.parametrizations.spectral_norm`:
tensor([[[[-6.4015e-03, -5.9812e-03, -5.5848e-04, ..., -6.6797e-03,
-5.4990e-03, -4.3101e-03],
[-1.1050e-02, -2.3454e-02, -2.0890e-02, ..., -1.0111e-02,
-1.7187e-02, -2.5395e-03],
[-1.0570e-02, -9.7215e-03, -1.9345e-02, ..., -1.3104e-02,
-2.1078e-02, -8.5044e-03],
...,
[-7.6973e-03, -1.3273e-02, -9.9956e-03, ..., -1.3413e-02,
-2.1045e-02, 1.4347e-03],
[-2.9391e-04, 5.1208e-05, -4.4562e-03, ..., -7.5174e-04,
-1.0988e-02, -5.6328e-03],
[-3.7524e-03, -8.9492e-04, -8.4569e-04, ..., -3.2808e-03,
1.0416e-03, 1.3095e-03]]],
This is a bit-wise deterministic run, by the way.
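By "bit-wise deterministic" I mean the run was seeded and restricted to deterministic kernels, along these lines (a sketch; the actual setup may differ):

```python
import torch

torch.manual_seed(0)                      # fix RNG for weights and inputs
torch.use_deterministic_algorithms(True)  # error out on nondeterministic ops
torch.backends.cudnn.benchmark = False    # disable conv algorithm autotuning
# On CUDA, setting CUBLAS_WORKSPACE_CONFIG=":4096:8" may also be required.
```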
Here's a naive comparison suggesting the old function produces less extreme values. Min/max of the output tensor over the last 10 of 100 iterations:
Old:
min | max |
---|---|
-4.7022 | -0.1812 |
-4.4896 | -0.1821 |
-3.4386 | 23.3624 |
-4.4896 | -0.1821 |
-4.5798 | -0.1844 |
-1.1253 | 19.4105 |
-4.5797 | -0.1844 |
-4.6552 | -0.1855 |
0.2628 | 22.2968 |
-4.6551 | -0.1855 |
Parametrization:
min | max |
---|---|
-4.8454 | -0.1950 |
-4.5237 | -0.1968 |
-2.6446 | 32.7689 |
-4.5234 | -0.1968 |
-4.7137 | -0.2026 |
0.1529 | 27.5732 |
-4.7134 | -0.2026 |
-4.8490 | -0.2062 |
0.4218 | 31.8953 |
-4.8486 | -0.2062 |
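The tables above were collected roughly like this (a sketch; `model` here is a stand-in for the spectral-normalized UNet, and the input is random rather than real data):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the spectral-normalized UNet
model = nn.utils.parametrizations.spectral_norm(
    nn.Conv2d(1, 1, kernel_size=3, padding=1)
)

stats = []
for _ in range(100):
    x = torch.randn(1, 1, 8, 8)
    with torch.no_grad():
        y = model(x)
    stats.append((y.min().item(), y.max().item()))

# Print the last 10 of 100 iterations in the table format above
for mn, mx in stats[-10:]:
    print(f"{mn:.4f} | {mx:.4f} |")
```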
So it looks like the max values from the new function are roughly 40% higher than from the old one (e.g. 32.77 vs. 23.36).
Is there a reason for that, or am I using it wrong? Thanks in advance.