I’m trying to train a model that uses a symmetric matrix for its linear layer, and I wonder how to implement the symmetric matrix efficiently in PyTorch.
I have seen this approach, but I think it does not fulfill my needs, since it introduces more trainable parameters than necessary: `features * features` parameters instead of `features * (features + 1) / 2`.
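To make the difference concrete, here is the parameter count for the small example I use below (`features = 3`):

```python
features = 3
dense = features * features            # parameters in a full weight matrix
tri = features * (features + 1) // 2   # parameters in the lower triangle, diagonal included
print(dense, tri)  # 9 6
```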
I came up with the following solution:
```python
import torch

features = 3
num_weights = features * (features + 1) // 2
weights = torch.randn(num_weights)

tri_mat = torch.zeros(features, features)
tri_idx = torch.tril_indices(features, features)
tri_mat[tri_idx[0], tri_idx[1]] = weights  # fill the lower triangle

symmetric_weight = torch.tril(tri_mat) + torch.tril(tri_mat, -1).t()
```
However, I suspect creating the temporary `tri_mat` could be avoided. Any ideas on how to do this more elegantly/efficiently?
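For context, this is roughly how I intend to use it in practice (a minimal sketch; the class name `SymmetricLinear` and the bias-free matmul are just my placeholder choices, not a fixed requirement):

```python
import torch
import torch.nn as nn

class SymmetricLinear(nn.Module):
    """Linear layer whose weight matrix is constrained to be symmetric.

    Only the n * (n + 1) / 2 lower-triangular entries are trainable.
    """

    def __init__(self, features: int):
        super().__init__()
        self.features = features
        num_weights = features * (features + 1) // 2
        self.weights = nn.Parameter(torch.randn(num_weights))
        # Precompute the lower-triangular index pairs once (not trainable).
        self.register_buffer("tri_idx", torch.tril_indices(features, features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scatter the flat parameter vector into the lower triangle.
        tri_mat = torch.zeros(
            self.features, self.features, device=x.device, dtype=x.dtype
        )
        tri_mat[self.tri_idx[0], self.tri_idx[1]] = self.weights
        # Mirror the strictly-lower part onto the upper part.
        symmetric_weight = tri_mat + torch.tril(tri_mat, -1).t()
        return x @ symmetric_weight
```

Passing the identity through the layer returns the weight matrix itself, which is an easy way to check that it really is symmetric and that only 6 parameters are trained for `features = 3`.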