Hello there! I’m experimenting with a network where I reuse, in my forward pass, several layers that are initialized once in `__init__`, so that their weights are tied together. However, when I run a “parameter counter” like torchsummary, it reports the same number of parameters as a net that has a unique module for each layer.
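Here’s a minimal sketch of the pattern I mean (my actual net is larger and the layer sizes here are made up, but the reuse is the same idea):

```python
import torch
import torch.nn as nn
from torchsummary import summary


class TiedNet(nn.Module):
    """The same Linear module is created once and called twice in forward()."""

    def __init__(self):
        super().__init__()
        self.shared = nn.Linear(64, 64)  # defined once in __init__
        self.out = nn.Linear(64, 10)

    def forward(self, x):
        x = torch.relu(self.shared(x))  # first use
        x = torch.relu(self.shared(x))  # second use -- same module, same weights
        return self.out(x)


model = TiedNet()
# torchsummary hooks every forward call, so the reused layer shows up
# as a separate row each time it is called.
summary(model, input_size=(64,), device="cpu")
```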
My question is: is reusing a layer module enough to tie the weights? It sounds like a silly question, because how could backprop see additional parameters if no additional module exists, but I was wondering whether reusing the layer somehow creates a shadow copy that caches separate parameters for each use.
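Part of what confuses me: if I count the parameters by hand instead (using the `TiedNet` class from the sketch above), I only see the shared layer once. Is this a reliable way to confirm the weights really are tied?

```python
tied = TiedNet()

# model.parameters() walks the registered modules, so the reused layer's
# weight and bias are counted a single time.
n_params = sum(p.numel() for p in tied.parameters())
print(n_params)  # shared.weight + shared.bias + out.weight + out.bias

for name, p in tied.named_parameters():
    print(name, tuple(p.shape))
# shared.weight (64, 64)
# shared.bias   (64,)
# out.weight    (10, 64)
# out.bias      (10,)
```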