Hi, I wonder how I could do alternating training, e.g., input -> net1 -> net2 -> output.
Say I have defined net1 to have layers like:
```python
import torch.nn as nn

class Net1(nn.Module):
    def __init__(self):
        super().__init__()
        # non-conv layers are not shown here; channel sizes are placeholders
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 16, 3, padding=1)
        self.conv3 = nn.Conv2d(16, 16, 3, padding=1)
        self.output = nn.Conv2d(16, 3, 3, padding=1)  # last conv layer

    def forward(self, x):
        return self.output(self.conv3(self.conv2(self.conv1(x))))
```
For alternating training, I want to train net1 first, then freeze all params in net1 except for the last conv layer and train net2. From searching around, I put together something like the code below, but haven't got it running yet:
```python
from torch.optim import SGD

net1.train()
net2.train()
optim1 = SGD(net1.parameters(), lr=0.01)  # parameters() must be called, and SGD needs an lr
optim2 = SGD(net2.parameters(), lr=0.01)

for i, (data, label) in enumerate(data_loader):
    # Variable is deprecated and `async` was renamed to `non_blocking`;
    # my second line also had a typo (it moved `data`, not `label`, to the GPU)
    data = data.cuda(non_blocking=True)
    label = label.cuda(non_blocking=True)

    # phase 1: train net1 on its own loss
    net1_out = net1(data)
    net1_loss = loss1(net1_out, label)
    optim1.zero_grad()
    # retain the graph so the second backward below can run through net1_out
    net1_loss.backward(retain_graph=True)
    optim1.step()

    # phase 2: freeze all of net1 except the last conv layer, then train net2
    for param in net1.parameters():
        param.requires_grad = False  # was: require_grad (typo, just set a dead attribute)
    for param in net1.output.parameters():
        # my original attempt, net1.output.require_grad = True,
        # raised: object has no attribute 'output'
        param.requires_grad = True

    net2_out = net2(net1_out)
    net2_loss = loss2(net2_out, label)
    optim2.zero_grad()
    net2_loss.backward()
    optim2.step()
```
- How do I properly unfreeze the last layer in net1 when training net2? The network structure here is deliberately simple; I wonder if there is a general way of retrieving the last layer of a network, or any particular layer (see the sketch after these questions for the kind of access I mean).
- Would this work as I expect, i.e., alternate training between net1 and net2?
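For the first question, this is the kind of generic access I was hoping for. It is only a minimal sketch: `last_layer`, `get_layer`, and `set_trainable` are helper names I made up, and it assumes the network is a plain `nn.Module` whose submodules were registered in definition order (the `Net1` above fits this, but I don't know if that holds for arbitrary networks):

```python
import torch.nn as nn

def last_layer(net: nn.Module) -> nn.Module:
    # children() yields direct submodules in registration order,
    # so the last one is the last layer *if* the module was built that way
    return list(net.children())[-1]

def get_layer(net: nn.Module, name: str) -> nn.Module:
    # named_modules() walks the whole module tree,
    # so nested layers can be fetched by dotted name, e.g. "block.0.conv"
    return dict(net.named_modules())[name]

def set_trainable(module: nn.Module, flag: bool) -> None:
    # toggle requires_grad on every parameter of a (sub)module
    for p in module.parameters():
        p.requires_grad = flag
```

With these, freezing everything but the last layer would be `set_trainable(net1, False)` followed by `set_trainable(last_layer(net1), True)` — though I'm not sure relying on `children()` order is safe in general, which is really what I'm asking.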