How to do alternating training, i.e., freeze first net's parameters except for the last layer while training the 2nd network?

galactica147 · September 6, 2017, 4:13am

Hi, I wonder how I could do alternating training for a pipeline like input -> net1 -> net2 -> output.
Say I have defined net1 with layers like these:

def forward():
    # non-conv layers are not shown here
    input = torch.nn.Conv2d()
    layer2 = torch.nn.Conv2d()
    layer3 = torch.nn.Conv2d()
    output = torch.nn.Conv2d()
    return output

For alternating training, I want to train net1 first, then freeze all parameters in net1 except those of the last conv layer, and train net2. After searching around, I put together something like the code below, but haven't gotten it running yet:

net1.train()
net2.train()

optim1 = SGD(net1.parameters())
optim2 = SGD(net2.parameters())

for i, (data, label) in enumerate(data_loader):
    data = data.cuda(non_blocking=True)
    label = label.cuda(non_blocking=True)
    net1_out = net1(data)
    net1_loss = loss1(net1_out, label)
    optim1.zero_grad()
    net1_loss.backward()
    optim1.step()

    for param in net1.parameters():
        param.requires_grad = False
    net1.output.requires_grad = True  # error: object has no attribute 'output'

    net2_out = net2(net1_out)
    net2_loss = loss2(net2_out, label)
    optim2.zero_grad()
    net2_loss.backward()
    optim2.step()
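
A minimal sketch of the alternating scheme described above could look like the following. Everything concrete in it is an assumption for illustration: net1 and net2 are tiny hypothetical nn.Sequential stand-ins, both losses are MSELoss, the learning rate and tensor shapes are arbitrary, and random tensors replace the data loader. It differs from the loop above in two ways: the second optimizer is given net2's parameters plus those of net1's last layer, and net1 is run again for the second phase, since the graph behind the first net1_out is freed by the first backward() call.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for net1 and net2; layer sizes are arbitrary.
net1 = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 3, 3, padding=1),  # last conv layer of net1
)
net2 = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))
loss1 = nn.MSELoss()  # assumed loss functions
loss2 = nn.MSELoss()

optim1 = torch.optim.SGD(net1.parameters(), lr=0.01)
# Phase 2 updates net2 and only the last layer of net1.
optim2 = torch.optim.SGD(
    list(net2.parameters()) + list(net1[-1].parameters()), lr=0.01
)


def set_requires_grad(module, flag):
    """Turn gradient tracking on or off for every parameter of a module."""
    for p in module.parameters():
        p.requires_grad = flag


# Random tensors standing in for one batch from data_loader.
data = torch.randn(4, 3, 8, 8)
label = torch.randn(4, 3, 8, 8)

# Phase 1: train all of net1.
set_requires_grad(net1, True)
net1_loss = loss1(net1(data), label)
optim1.zero_grad()
net1_loss.backward()
optim1.step()

# Phase 2: freeze net1 except its last layer, then train net2.
set_requires_grad(net1, False)
set_requires_grad(net1[-1], True)

# Snapshots, only used to check which weights move in phase 2.
first_before = net1[0].weight.detach().clone()
last_before = net1[-1].weight.detach().clone()

net2_out = net2(net1(data))  # fresh forward pass through net1
net2_loss = loss2(net2_out, label)
optim2.zero_grad()
net2_loss.backward()
optim2.step()
```

In a real loop both phases would sit inside the `for` over the data loader, with the `set_requires_grad(net1, True)` call repeated at the start of every iteration so that net1 is fully trainable again in phase 1.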

Questions:

How do I properly unfreeze the last layer in net1 when training net2? The network structure here is kept simple for illustration; I wonder if there is a general way of retrieving the last layer of a network, or any particular layer.
Would this work as I expect, i.e., would training alternate between net1 and net2?
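
To make the first question concrete, here is a sketch of one way a layer could be looked up, assuming the layers are registered as attributes in `__init__` rather than created inside `forward`. The class `Net1` and its channel sizes are hypothetical. `named_children()` and `named_parameters()` are standard `nn.Module` methods; note that they yield submodules in registration order, which is not necessarily the order in which `forward` calls them.

```python
import torch
import torch.nn as nn


class Net1(nn.Module):
    """Hypothetical net1 with its conv layers registered as attributes."""

    def __init__(self):
        super().__init__()
        self.input = nn.Conv2d(3, 8, 3, padding=1)
        self.layer2 = nn.Conv2d(8, 8, 3, padding=1)
        self.layer3 = nn.Conv2d(8, 8, 3, padding=1)
        self.output = nn.Conv2d(8, 3, 3, padding=1)

    def forward(self, x):
        x = torch.relu(self.input(x))
        x = torch.relu(self.layer2(x))
        x = torch.relu(self.layer3(x))
        return self.output(x)


net1 = Net1()

# Last registered child module (registration order, not execution order).
last_name, last_layer = list(net1.named_children())[-1]

# Any particular layer can be fetched by name.
layer3 = dict(net1.named_children())["layer3"]

# Freeze every parameter except those belonging to the last layer.
for name, param in net1.named_parameters():
    param.requires_grad = name.startswith(last_name + ".")

trainable = [n for n, p in net1.named_parameters() if p.requires_grad]
print(last_name, trainable)
```

Because the layers are attributes here, `net1.output` also resolves directly, so `net1.output.parameters()` gives the same parameters as `last_layer.parameters()`.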

Thanks!