Training a part of a net

a-parida12 · February 17, 2019, 2:29am

My network looks like this:

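import torch.nn as nn
from collections import OrderedDict
from torchvision.models import vgg16

class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = vgg16()
        # num_classes is not defined in this post; it is assumed to be set elsewhere
        l7 = [('fc7', nn.Conv2d(num_classes * 512, num_classes, 1))]
        l8 = [('fc8', nn.Conv2d(num_classes, num_classes, 1))]
        self.conv1 = nn.Sequential(OrderedDict(l7))
        self.conv2 = nn.Sequential(OrderedDict(l8))

    def forward(self, input, warmup=True):
        x = self.encoder(input)
        x = self.conv1(x)
        if warmup:
            # warm-up: stop after conv1, so conv2 is never used
            return x

        x = self.conv2(x)
        return x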

I want to train only a certain part of the network during the “warm-up” phase, and this is how I update the weights:

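import torch.optim as optim

NetworkTrg = Network().cuda()
# hand the optimizer only the parameters that require gradients
learned_params = filter(lambda p: p.requires_grad, NetworkTrg.parameters())
opt = optim.SGD(learned_params, lr=1e-3)

for inputs, target in data:
    # the second positional argument is received by forward() as `warmup`
    predict = NetworkTrg(inputs.cuda(), SupportPrototypes)
    loss = LossFn(predict, target)
    loss.backward()
    opt.step()
    opt.zero_grad()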

I want to know whether it is valid to train part of the network the way I do. If not, how should I modify the current setup to allow part-wise training?

ptrblck · February 17, 2019, 12:24pm

This should generally work. For most network architectures, however, you'll run into trouble with the output shapes of the intermediate layers, since an activation taken from the middle of the network usually doesn't have the shape the loss expects.
In the Inception paper they added an auxiliary path so that the output had a reasonable shape for computing the loss.
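
To make the auxiliary-path idea concrete, here is a minimal sketch. It is not the Inception architecture itself; `NetWithAuxHead`, its layer sizes, and the `aux_head` name are all made up for illustration. The auxiliary head pools the intermediate feature map and projects it to class logits, so the warm-up output has the same shape as the full output and the same loss can be used for both.

```python
import torch
import torch.nn as nn

class NetWithAuxHead(nn.Module):
    """Hypothetical toy model: two stages plus an auxiliary head on stage 1."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.stage2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(32, num_classes)
        # Auxiliary path: turn the (N, 16, H, W) feature map into (N, num_classes)
        self.aux_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes))

    def forward(self, x, warmup=True):
        feats = self.stage1(x)
        if warmup:
            # Early exit through the auxiliary head
            return self.aux_head(feats)
        out = self.stage2(feats).flatten(1)
        return self.classifier(out)

net = NetWithAuxHead()
x = torch.randn(4, 3, 32, 32)
target = torch.randint(0, 10, (4,))
loss_fn = nn.CrossEntropyLoss()

warm_logits = net(x, warmup=True)    # logits from the auxiliary path
full_logits = net(x, warmup=False)   # logits from the full network
warm_loss = loss_fn(warm_logits, target)
full_loss = loss_fn(full_logits, target)
```

Without the auxiliary head, the warm-up output would be a 4-D feature map with 16 channels, which doesn't match the class-index targets used here.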
If you are just dealing with a fully convolutional model that doesn't change the spatial size too much (as in your example), it should be fine.
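
If you want to check that the early return really trains only part of the model, a small test like the one below might help. It is only a sketch: `TinyNet` is a made-up stand-in for the network in the question (a single conv layer instead of VGG16), and the data is random. The idea is that layers skipped in the forward pass get no gradient, so plain SGD leaves them untouched.

```python
import torch
import torch.nn as nn
import torch.optim as optim

class TinyNet(nn.Module):
    """Hypothetical stand-in for the Network in the question."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.encoder = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.conv1 = nn.Conv2d(8, num_classes, kernel_size=1)
        self.conv2 = nn.Conv2d(num_classes, num_classes, kernel_size=1)

    def forward(self, x, warmup=True):
        x = self.conv1(self.encoder(x))
        if warmup:
            return x
        return self.conv2(x)

torch.manual_seed(0)
net = TinyNet()
opt = optim.SGD(net.parameters(), lr=1e-3)   # plain SGD: no momentum, no weight decay
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(2, 3, 16, 16)
target = torch.randint(0, 3, (2, 16, 16))    # random per-pixel class labels

conv1_before = net.conv1.weight.detach().clone()
conv2_before = net.conv2.weight.detach().clone()

opt.zero_grad()
loss = loss_fn(net(inputs, warmup=True), target)
loss.backward()
opt.step()

# conv2 was never part of the graph, so it has no gradient and is not updated
conv2_has_grad = net.conv2.weight.grad is not None
conv2_changed = not torch.equal(conv2_before, net.conv2.weight)
conv1_changed = not torch.equal(conv1_before, net.conv1.weight)
print(conv2_has_grad, conv2_changed, conv1_changed)
```

One caveat: this relies on the skipped parameters having no gradient at all. If they hold a zero-filled gradient instead (for example after an earlier full forward pass followed by a `zero_grad()` that zeroes rather than clears), an optimizer with momentum or weight decay can still move them. Setting `requires_grad_(False)` on the layers you want frozen, or passing only the warm-up parameters to the optimizer, is the more explicit route.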