Does calling a layer multiple times share the same weights?

Hi, I am not sure whether, if we call a layer defined in `__init__` multiple times, the calls share weights during training. For example, we have a layer fc1 defined as:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 10)

Then I call it multiple times in forward:

    def forward(self, x1, x2):
        x1 = self.fc1(x1)
        x2 = self.fc1(x2)
        return x1, x2

I wonder whether these two calls maintain the same weights during training. If not, how do we make them share weights? Thanks a lot!


Yes, since you are calling the same layer, the same underlying parameters (weight and bias) will be used for both computations, and gradients from both calls will accumulate into those shared parameters.
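One way to convince yourself of this is a minimal sketch (expanding the snippets above): the reused layer contributes only one weight and one bias to `parameters()`, and a backward pass through both branches accumulates gradients into that single set of parameters.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # One Linear layer, reused for both inputs
        self.fc1 = nn.Linear(10, 10)

    def forward(self, x1, x2):
        # Both calls go through the same parameters
        return self.fc1(x1), self.fc1(x2)

net = Net()
y1, y2 = net(torch.randn(3, 10), torch.randn(3, 10))

# fc1 appears only once in the module, so there are just
# two parameter tensors in total: fc1.weight and fc1.bias
num_params = sum(1 for _ in net.parameters())

# Gradients from both branches accumulate into the same .grad
(y1.sum() + y2.sum()).backward()
```

After `backward()`, `net.fc1.weight.grad` holds the sum of the gradients from both calls, which is exactly the behavior you want for weight sharing.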


Here is a similar question if you want to read more: How to create model with sharing weight?