How to share weights in a model and then unshare them later?

I have a model like this:

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer_a = nn.Linear(1,1, bias=False)
        self.layer_b = nn.Linear(1,1, bias=False)
        
    def forward(self, x):
        x_a = self.layer_a(x)
        x_b = self.layer_b(x)
        
        return (x_a, x_b)

There are two phases of training:
Phase 1: self.layer_a and self.layer_b share the same weight.
Phase 2: self.layer_a is frozen, only self.layer_b keeps updating.

To achieve Phase 1, I do something like

model = Net()
model.layer_b.weight = model.layer_a.weight

Then both of them share the same weight.
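
As a quick check (a minimal sketch reusing the Net class above; nothing outside it is assumed), the assignment really does tie both layers to a single Parameter, and one optimizer step moves them together:

import torch

model = Net()
model.layer_b.weight = model.layer_a.weight

print(model.layer_a.weight is model.layer_b.weight)  # True: one shared Parameter

# A single optimizer step now updates both layers at once, because
# parameters() yields the shared tensor only once and both branches
# accumulate their gradients into it.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x_a, x_b = model(torch.randn(4, 1))
(x_a.sum() + x_b.sum()).backward()
opt.step()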

But the problem is: how do I freeze model.layer_a.weight in phase 2 without affecting model.layer_b.weight?

Or is there a better way to achieve this?

I would remove nn.Linear altogether, and have something like

w2 = self.weight_b if stage2 else self.weight_a
x_b = x.matmul(w2.t())
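
For reference, a minimal sketch of that flag-based version, with made-up parameter names weight_a / weight_b and an explicit stage2 flag:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # plain parameters instead of two nn.Linear modules
        self.weight_a = nn.Parameter(torch.randn(1, 1))
        self.weight_b = nn.Parameter(torch.randn(1, 1))

    def forward(self, x, stage2=False):
        x_a = x.matmul(self.weight_a.t())
        # phase 1: branch b reuses weight_a; phase 2: it switches to weight_b
        w2 = self.weight_b if stage2 else self.weight_a
        x_b = x.matmul(w2.t())
        return x_a, x_b

When switching to phase 2, you would presumably copy weight_a into weight_b first and then freeze weight_a (e.g. set its requires_grad to False or leave it out of the optimizer).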

Another alternative that needs no flag passing:

w2 = self.weight_a + self.weight_b

Initialize weight_b to zero and don’t train it in phase 1; detach or otherwise freeze weight_a in phase 2.
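
A rough sketch of that additive variant (same made-up names; branch b's effective weight is weight_a + weight_b):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight_a = nn.Parameter(torch.randn(1, 1))
        # zero-initialised and frozen during phase 1, so branch b behaves
        # exactly like branch a and all gradients flow into weight_a
        self.weight_b = nn.Parameter(torch.zeros(1, 1))
        self.weight_b.requires_grad_(False)

    def forward(self, x):
        x_a = x.matmul(self.weight_a.t())
        x_b = x.matmul((self.weight_a + self.weight_b).t())
        return x_a, x_b

# switching to phase 2: freeze weight_a, train only weight_b
# model.weight_a.requires_grad_(False)
# model.weight_b.requires_grad_(True)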

Thanks for the suggestion. I was using nn.Linear to illustrate my problem. In reality, my model consists of two deep convolutional networks.

If it is simply an nn.Linear, your suggestion does work. But what about deep convolutional networks?

Try doing the sharing with

with torch.no_grad():
    # copy the values; the two weights remain separate parameters
    model.layer_b.weight.copy_(model.layer_a.weight)
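
If the two branches are whole convolutional sub-networks rather than single layers, the same copy should extend to all of their parameters. A sketch, with conv_a and conv_b as placeholder names for the two (identically structured) sub-modules:

with torch.no_grad():
    # copy every parameter of branch a into the matching parameter of branch b
    for p_a, p_b in zip(model.conv_a.parameters(), model.conv_b.parameters()):
        p_b.copy_(p_a)

Equivalently, model.conv_b.load_state_dict(model.conv_a.state_dict()) copies everything in one call, including buffers such as BatchNorm running statistics.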

I see. Then when I want to train them separately again, I just need to re-enable the gradient for layer_b?

Yes, probably :)
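
Roughly, following the copy-based scheme above (just a sketch):

# phase 1: keep layer_b out of the updates and copy layer_a's weight into it
model.layer_b.weight.requires_grad_(False)

# phase 2: freeze layer_a and let layer_b train on its own
model.layer_a.weight.requires_grad_(False)
model.layer_b.weight.requires_grad_(True)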