Hi, I wonder how I could do alternating training, e.g., input -> net1 -> net2 -> output.
Say I have defined net1 to have layers like:
```python
import torch.nn as nn

class Net1(nn.Module):
    def __init__(self):
        super().__init__()
        # non-conv layers are not shown here; channel sizes are placeholders
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 16, 3, padding=1)
        self.conv3 = nn.Conv2d(16, 16, 3, padding=1)
        self.output = nn.Conv2d(16, 3, 3, padding=1)  # last conv layer

    def forward(self, x):
        return self.output(self.conv3(self.conv2(self.conv1(x))))
```
For alternating training, I want to train net1 first, then freeze all params in net1 except for the last conv layer and train net2. From searching around, I put together something like the code below, but haven't got it running yet:
```python
from torch.optim import SGD

net1.train()
net2.train()
optim1 = SGD(net1.parameters(), lr=0.01)  # parameters() must be called, and SGD needs an lr
optim2 = SGD(net2.parameters(), lr=0.01)

for i, (data, label) in enumerate(data_loader):
    # Variable is deprecated and `async` was renamed to `non_blocking`;
    # my second line also had a typo (it moved `data`, not `label`, to the GPU)
    data = data.cuda(non_blocking=True)
    label = label.cuda(non_blocking=True)

    # phase 1: train net1 on its own loss
    net1_out = net1(data)
    net1_loss = loss1(net1_out, label)
    optim1.zero_grad()
    # retain the graph so the second backward below can run through net1_out
    net1_loss.backward(retain_graph=True)
    optim1.step()

    # phase 2: freeze all of net1 except the last conv layer, then train net2
    for param in net1.parameters():
        param.requires_grad = False  # was: require_grad (typo, just set a dead attribute)
    for param in net1.output.parameters():
        # my original attempt, net1.output.require_grad = True,
        # raised: object has no attribute 'output'
        param.requires_grad = True

    net2_out = net2(net1_out)
    net2_loss = loss2(net2_out, label)
    optim2.zero_grad()
    net2_loss.backward()
    optim2.step()
```
- How do I properly unfreeze the last layer in net1 when training net2? The network structure here is deliberately simple; I wonder if there is a general way of retrieving the last layer of a network, or any particular layer (see the sketch after these questions for the kind of access I mean).
- Would this work as I expect, i.e., alternate training between net1 and net2?
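For the first question, this is the kind of generic access I was hoping for. It is only a minimal sketch: `last_layer`, `get_layer`, and `set_trainable` are helper names I made up, and it assumes the network is a plain `nn.Module` whose submodules were registered in definition order (the `Net1` above fits this, but I don't know if that holds for arbitrary networks):

```python
import torch.nn as nn

def last_layer(net: nn.Module) -> nn.Module:
    # children() yields direct submodules in registration order,
    # so the last one is the last layer *if* the module was built that way
    return list(net.children())[-1]

def get_layer(net: nn.Module, name: str) -> nn.Module:
    # named_modules() walks the whole module tree,
    # so nested layers can be fetched by dotted name, e.g. "block.0.conv"
    return dict(net.named_modules())[name]

def set_trainable(module: nn.Module, flag: bool) -> None:
    # toggle requires_grad on every parameter of a (sub)module
    for p in module.parameters():
        p.requires_grad = flag
```

With these, freezing everything but the last layer would be `set_trainable(net1, False)` followed by `set_trainable(last_layer(net1), True)` — though I'm not sure relying on `children()` order is safe in general, which is really what I'm asking.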