Symmetric parametrization

Hello,

In the PyTorch parametrization tutorial, the symmetric parametrization is implemented as follows:

import torch.nn as nn

class Symmetric(nn.Module):
    def forward(self, X):
        # read only the upper triangle of X; the lower triangle is ignored
        return X.triu() + X.triu(1).transpose(-1, -2)

Is there a reason (especially for gradient computation) why this would be preferred to:

class Symmetric(nn.Module):
    def forward(self, X):
        # every entry contributes: X[i, j] and X[j, i] are summed
        return X + X.transpose(-1, -2)
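
For reference, here is how I register either version; a minimal sketch (the 3x3 linear layer is just for illustration):

import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

# uses one of the Symmetric modules defined above
layer = nn.Linear(3, 3)
parametrize.register_parametrization(layer, "weight", Symmetric())
print(torch.allclose(layer.weight, layer.weight.transpose(-1, -2)))  # True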

Hi Octave!

I can’t think of any compelling reason – both should work.

One thing to note is that in the original version, the non-symmetric matrix X contains
ignored elements (those in the lower triangle), while your version contains redundant
elements that aren’t ignored: each off-diagonal pair X[i, j] and X[j, i] is summed into
a single entry of the result. Arguably the former could be better.
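
To make that concrete, here is a quick check (a small 3x3 example; the sum is just a
stand-in loss) showing that the lower-triangle entries get zero gradient in the tutorial
version, while every entry gets a gradient in yours:

import torch

X = torch.randn(3, 3, requires_grad=True)
(X.triu() + X.triu(1).transpose(-1, -2)).sum().backward()
print(X.grad)  # zero below the diagonal: those entries are ignored

Y = torch.randn(3, 3, requires_grad=True)
(Y + Y.transpose(-1, -2)).sum().backward()
print(Y.grad)  # nonzero everywhere: paired entries are redundant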

In your case, you add two elements together, one from the upper triangle and one from
the lower triangle. So if you (or the optimization step) add a large value to the upper
element and subtract the same value from the lower element, it has no effect on the
result. This means that you have a direction (a “mode”) along which the loss is exactly
flat, so nothing pulls those components back, and they could drift off to +/-inf,
potentially causing problems. (The ignored values in the original approach could also
potentially drift off, but that’s less likely to happen and less likely to cause
problems if they do.)
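
To illustrate that flat mode: any antisymmetric perturbation of X, however large, is
invisible in X + X.transpose(-1, -2). A minimal sketch (double precision just to keep
the comparison numerically exact):

import torch

X = torch.randn(3, 3, dtype=torch.double)
A = torch.randn(3, 3, dtype=torch.double)
A = A - A.transpose(-1, -2)    # antisymmetric: its transpose equals -A
W1 = X + X.transpose(-1, -2)
Xp = X + 1000 * A              # large step along the flat mode
W2 = Xp + Xp.transpose(-1, -2)
print(torch.allclose(W1, W2))  # True: the perturbation has no effect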

Best.

K. Frank

Hello K. Frank,

Thank you for all the details on how redundancy could cause problems!