I have a model like this:

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer_a = nn.Linear(1, 1, bias=False)
        self.layer_b = nn.Linear(1, 1, bias=False)

    def forward(self, x):
        x_a = self.layer_a(x)
        x_b = self.layer_b(x)
        return (x_a, x_b)
There are two phases of training:

Phase 1: self.layer_a and self.layer_b share the same weight.
Phase 2: self.layer_a is frozen; only self.layer_b keeps updating.
To achieve Phase 1, I do something like
model = Net()
model.layer_b.weight = model.layer_a.weight
Then both of them share the same weight.
But the problem is: how do I freeze model.layer_a.weight in phase 2 without affecting model.layer_b.weight? Or is there a better way to achieve this?
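To make the problem concrete: after the assignment above, both attributes point at the same Parameter object, so freezing one freezes the other. A small check (using the Net class from the snippet above):

model = Net()
model.layer_b.weight = model.layer_a.weight
assert model.layer_b.weight is model.layer_a.weight  # one shared Parameter object

model.layer_a.weight.requires_grad = False
print(model.layer_b.weight.requires_grad)  # False: layer_b is frozen as well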
I would remove nn.Linear altogether and have something like this in forward:

# pick which weight the second branch uses
w2 = self.weight_b if stage2 else self.weight_a
x_b = x.matmul(w2.t())
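A minimal sketch of that idea for the toy model above (the weight names and the stage2 flag come from the lines above; everything else is assumed):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # raw Parameters instead of nn.Linear modules, shaped like Linear(1, 1).weight
        self.weight_a = nn.Parameter(torch.randn(1, 1))
        self.weight_b = nn.Parameter(torch.randn(1, 1))

    def forward(self, x, stage2=False):
        x_a = x.matmul(self.weight_a.t())
        # phase 1: the second branch reuses weight_a; phase 2: it uses weight_b
        w2 = self.weight_b if stage2 else self.weight_a
        x_b = x.matmul(w2.t())
        return (x_a, x_b)

At the phase switch you would presumably copy weight_a's values into weight_b (e.g. with weight_b.data.copy_(weight_a.data)) and set weight_a.requires_grad = False.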
Another alternative that needs no flag passing:

w2 = self.weight_a + self.weight_b

Init weight_b to zero and don't train it in phase 1; detach or otherwise freeze weight_a in phase 2.
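A sketch of that variant (the zero-init and the requires_grad toggles are the suggestion above; the rest is assumed):

import torch
import torch.nn as nn

class NetAdditive(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight_a = nn.Parameter(torch.randn(1, 1))
        # zero-init and frozen: in phase 1 both branches effectively use weight_a
        self.weight_b = nn.Parameter(torch.zeros(1, 1), requires_grad=False)

    def forward(self, x):
        x_a = x.matmul(self.weight_a.t())
        x_b = x.matmul((self.weight_a + self.weight_b).t())  # w2 = weight_a + weight_b
        return (x_a, x_b)

# phase 2: freeze weight_a, start training weight_b
model = NetAdditive()
model.weight_a.requires_grad = False
model.weight_b.requires_grad = True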
Thanks for the suggestion. I was using nn.Linear to illustrate my problem; in reality, my model consists of two deep convolutional networks. If it were simply an nn.Linear, your suggestions would work. But what if it is a deep convolutional network?
Try doing sharing the same way with the functional interface: call torch.nn.functional.conv2d with explicit weight Parameters instead of using nn.Conv2d modules.
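A sketch of how that could look for one conv layer (ConvBlock, the shapes, and the padding are made up for illustration; the additive weight_a + weight_b trick carries over unchanged):

import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.weight_a = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.01)
        # zero-init and frozen during phase 1, so both branches share weight_a
        self.weight_b = nn.Parameter(torch.zeros(out_ch, in_ch, 3, 3), requires_grad=False)

    def forward(self, x):
        x_a = F.conv2d(x, self.weight_a, padding=1)
        x_b = F.conv2d(x, self.weight_a + self.weight_b, padding=1)
        return x_a, x_b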
I see. Then when I want to train them separately again, I just need to activate the gradient for weight_b and freeze weight_a.
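Something like this, continuing the hypothetical ConvBlock sketch above:

# phase 2: freeze weight_a, activate the gradient for weight_b, in every block
for m in model.modules():
    if isinstance(m, ConvBlock):
        m.weight_a.requires_grad = False
        m.weight_b.requires_grad = True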