Why aren't the shared weights exactly the same?

This is a follow-up question from this thread. Basically, I shared weights between two layers, but the problem is that when I visualize the weights, they are not exactly the same; one of them is a bit off compared to the other (it looks washed out, if you will).
Here are the samples I'm talking about:


As you can see, they are nearly identical, except that the decoder's weights seem washed out (indicating more values are either 0 or very close to 0 compared to the encoder's weights).
However, given that the encoder and decoder share the same weights, why am I seeing this?
I trained this as a sparse autoencoder, by the way, and the weights are shared like this:

# create a fresh tensor shaped like the encoder's weight
weights = nn.Parameter(torch.randn_like(self.encoder[0].weight))
# copy it into the encoder, then point the decoder's weight at its transpose
self.encoder[0].weight.data = weights.clone()
self.decoder[0].weight.data = self.encoder[0].weight.data.t()
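
(For anyone who wants to check numerically rather than by eye, a quick comparison like the following should show whether the two parameters still match after some training steps; model is just a placeholder name for the module built from the snippet above.)

still_tied = torch.allclose(model.encoder[0].weight, model.decoder[0].weight.t())
print(still_tied)  # False once the two parameters have drifted apart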

What is the reason behind this behavior?
I'd be very grateful to know.

I'm a PyTorch noob, so take my answer with a grain of salt, but I've been working on a similar problem and just got it working. From your code segment I can't tell whether you implemented the weight tying in the forward function or not, but that was the problem for me. I originally tied the weights in __init__, which did not force them to update together, so during training they drifted apart. Instead, I now use functional calls in the decoder and reference the encoder's weight directly in that forward function... so far this seems to work.
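
Here is a minimal sketch of what I mean, assuming a single-linear-layer sparse autoencoder (the class and parameter names below are my own, not from your post): the decoder has no weight of its own, and forward() decodes with F.linear using the transpose of the encoder's weight, so both passes always read the same Parameter and its gradients accumulate in one place.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedSparseAutoencoder(nn.Module):
    def __init__(self, in_features=784, hidden_features=64):
        super().__init__()
        # the only weight matrix lives in the encoder
        self.encoder = nn.Linear(in_features, hidden_features)
        # the decoder only needs its own bias
        self.decoder_bias = nn.Parameter(torch.zeros(in_features))

    def forward(self, x):
        h = torch.relu(self.encoder(x))
        # tie the weights here: decode with the transposed encoder weight
        x_hat = F.linear(h, self.encoder.weight.t(), self.decoder_bias)
        return x_hat, h

Because the decoder never owns a separate weight Parameter, there is nothing that can drift; visualizing "both" weights will show exactly the same values.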


Thanks a lot, good to know. I'll give that a try and see how it goes. 🙂