Add multiple FC layers in parallel

I am using a pretrained ResNet-50 and performing fine-tuning. My code is as follows:

import torch.nn as nn
import torch.nn.functional as F
import torchvision

class ResNet50(nn.Module):
    def __init__(self, num_classes, loss={'xent'}, **kwargs):
        super(ResNet50, self).__init__()
        self.loss = loss
        resnet50 = torchvision.models.resnet50(pretrained=True)
        num_ftrs = resnet50.fc.in_features  # 2048 for ResNet-50
        self.base = nn.Sequential(*list(resnet50.children())[:-2])
        self.fc1 = nn.Linear(2048, num_classes)
        self.fc2 = nn.Linear(2048, num_classes)

    def forward(self, x):
        x = self.base(x)
        x = F.avg_pool2d(x, x.size()[2:])
        f = x.view(x.size(0), -1)
        if not self.training:
            return f
        y = self.classifier(f)

I want to add multiple fc layers in parallel after the ResNet base.

  1. How can I add multiple FC layers in parallel?

To answer your first question, I removed the unnecessary code (loss etc.) to get a minimal example. You could simply calculate the fc outputs one after the other (as true parallelism on GPUs can be hard to implement) and later on return a dictionary (or a list, or whatever you want).

class ResNet50(nn.Module):
    def __init__(self, num_classes, num_fcs=3, loss={'xent'}, **kwargs):
        super(ResNet50, self).__init__()
        self.loss = loss
        resnet50 = torchvision.models.resnet50(pretrained=True)
        self.base = nn.Sequential(*list(resnet50.children())[:-2])
        self.num_fcs = num_fcs
        # register one linear head per branch as fc0, fc1, ...
        for i in range(num_fcs):
            setattr(self, "fc%d" % i, nn.Linear(2048, num_classes))


    def forward(self, x):
        x = self.base(x)
        x = F.avg_pool2d(x, x.size()[2:])
        f = x.view(x.size(0), -1)

        clf_outputs = {}
        for i in range(self.num_fcs):
            clf_outputs["fc%d" % i] = getattr(self, "fc%d" % i)(f)

        return clf_outputs
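As a quick sanity check (the batch size and the 224x224 input below are arbitrary choices), each head produces its own logits from the shared pooled features:

import torch

model = ResNet50(num_classes=5, num_fcs=3)

x = torch.randn(2, 3, 224, 224)  # dummy batch of 2 RGB images
clf_outputs = model(x)

for name, logits in clf_outputs.items():
    print(name, logits.shape)  # fc0 torch.Size([2, 5]), fc1 ..., fc2 ...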

For your second question, you could simply try it like this:

# create a class instance
model = ResNet50(5)

# freeze the parameters of one fc layer (works for any of the fc heads);
# note: train(False) only switches a module to eval mode, it does not
# stop gradient updates, so set requires_grad = False instead
for param in model.fc0.parameters():
    param.requires_grad = False
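To double-check which parameters will actually be updated, you can inspect the requires_grad flags, and it also makes sense to hand only the trainable parameters to the optimizer (the learning rate below is arbitrary):

from torch.optim import SGD

# list the frozen parameters
for name, param in model.named_parameters():
    if not param.requires_grad:
        print("frozen:", name)  # e.g. fc0.weight, fc0.bias

# optimizer over the trainable parameters only
optim = SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=0.01)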

What does 5 signify here in ResNet50(5)?

If you look at the class definition, it takes one positional argument specifying the number of classes (5 was just an arbitrary value to create an instance).

Thanks a lot! How do I manage y in this case (from my code)?

I don't know what y is supposed to be; there is no definition of self.classifier in your code.

If you defined it somewhere else, you could simply return it together with the dictionary.

I had self.classifier = nn.Linear(2048, num_classes) in my code and missed including it.

def forward(self, x):
    x = self.base(x)
    x = F.avg_pool2d(x, x.size()[2:])
    f = x.view(x.size(0), -1)

    clf_outputs = {}
    for i in range(self.num_fcs):
        clf_outputs["fc%d" % i] = getattr(self, "fc%d" % i)(f)

    clf_outputs["y"] = self.classifier(f)

    return clf_outputs

The thing is, I'll freeze one fc layer and train the model with the second fc layer. In a second phase, I'll unfreeze and train the model with both fc layers on a different dataset. So how will I compute the loss pertaining to two fc layers, unlike in the code I provided?

You could simply calculate a separate loss for each of the outputs, add them up (or calculate the mean), and then call backward on the total loss.


I have to remove self.classifier because now I have multiple fc layers.

When I train jointly (fc0 and fc1 both turned on), how will I manage the loss then?

Calculate a mean/sum of both separate losses and call backward() on it. Autograd should handle the rest for you, since for each part of the loss only the parameters involved in its forward pass will be updated.

You mean this?

def forward(self, x):
    x = self.base(x)
    x = F.avg_pool2d(x, x.size()[2:])
    f = x.view(x.size(0), -1)
    clf_outputs = {}
    for i in range(self.num_fcs):
        clf_outputs["fc%d" % i] = getattr(self, "fc%d" % i)(f)

    clf_outputs["y1"] = self.fc0(f)
    clf_outputs["y2"] = self.fc1(f)
    l = (clf_outputs["y1"] + clf_outputs["y2"]) / 2

    if self.loss == {'xent'}:
        return l
    elif self.loss == {'xent', 'htri'}:
        return l, f
    elif self.loss == {'cent'}:
        return l, f
    else:
        raise KeyError("Unsupported loss: {}".format(self.loss))
    l.backward()
    return clf_outputs

First of all: you don't need the lines clf_outputs["y1"] = self.fc0(f) and clf_outputs["y2"] = self.fc1(f), as the loop over self.num_fcs already stores the same results in clf_outputs["fc0"] and clf_outputs["fc1"] instead of clf_outputs["y1"] and clf_outputs["y2"].

Second: that's not what I meant. You should simply use your forward function like this:

def forward(self, x):
    x = self.base(x)
    x = F.avg_pool2d(x, x.size()[2:])
    f = x.view(x.size(0), -1)
    clf_outputs = {}
    for i in range(self.num_fcs):
        clf_outputs["fc%d" % i] = getattr(self, "fc%d" % i)(f)

    if self.loss == {'xent'}:
        return clf_outputs
    elif self.loss == {'xent', 'htri'}:
        return clf_outputs, f
    elif self.loss == {'cent'}:
        return clf_outputs, f
    else:
        raise KeyError("Unsupported loss: {}".format(self.loss))

After you have created a model instance, you can do some predictions with clf_outputs, f = model(data_tensor) and later on calculate a loss for each of the values in clf_outputs (note that the forward above only returns f as well when the loss set is {'xent', 'htri'} or {'cent'}):

In the following snippet I assume you have a list of targets (one per fc layer)!

from torch.optim import SGD

model = ResNet50(10, num_fcs=2, loss={'xent', 'htri'})
optim = SGD(model.parameters(), lr=0.01)
clf_outputs, f = model(data_tensor)

# for simplicity I'm using MSELoss here, but you could use any other loss function as well
loss_fn = nn.MSELoss()
loss_value = 0
for k, v in clf_outputs.items():
    _curr_loss = loss_fn(v, target_list[int(k.replace("fc", ""))])
    loss_value = loss_value + _curr_loss

optim.zero_grad()
loss_value.backward()
optim.step()
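For the two-phase schedule you described (a sketch, assuming fc0 is the head frozen in phase 1), switching between the phases only means toggling requires_grad:

# phase 1: freeze fc0, train the base and fc1
for p in model.fc0.parameters():
    p.requires_grad = False

# phase 2: unfreeze everything and train both heads on the other dataset
for p in model.parameters():
    p.requires_grad = True

The summed loss from the loop above works unchanged in both phases, since no gradients flow into parameters with requires_grad = False.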

When I freeze FC layer 2 with

resnet50.fc1.train(False)

it gives me this:

AttributeError: 'ResNet' object has no attribute 'fc1'

How did you create your model?

I am using argparse:

parser.add_argument('-a', '--arch', type=str, default='resnet50', choices=models.get_names())

Then this is the relevant part of the train function:

def train(epoch, model, criterion_xent, criterion_htri, optimizer, trainloader, use_gpu):
    losses = AverageMeter()
    batch_time = AverageMeter()
    data_time = AverageMeter()

    model.train()

    for batch_idx, (imgs, pids, _) in enumerate(trainloader):
        if use_gpu:
            imgs, pids = imgs.cuda(), pids.cuda()

        outputs, features = model(imgs)
        if args.htri_only:
            if isinstance(features, tuple):
                loss = DeepSupervision(criterion_htri, features, pids)
            else:
                loss = criterion_htri(features, pids)
        else:
            if isinstance(outputs, tuple):
                xent_loss = DeepSupervision(criterion_xent, outputs, pids)
            else:
                xent_loss = criterion_xent(outputs, pids)

            # htri_loss is computed here in the full code (omitted in this excerpt)
            loss = xent_loss + htri_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Then I am using this:

model = models.init_model(name=args.arch, num_classes=dataset.num_train_pids, loss={'xent', 'htri'})

And what does your init_model function look like?

def init_model(name, *args, **kwargs):
    if name not in __factory.keys():
        raise KeyError("Unknown model: {}".format(name))
    return __factory[name](*args, **kwargs)
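If __factory maps architecture names to the model classes, along these lines (a hypothetical sketch; the actual dict isn't shown here)

__factory = {
    'resnet50': ResNet50,  # the custom wrapper defined above, not torchvision's ResNet
}

then init_model returns an instance of the custom wrapper, and the AttributeError above would come from looking up fc1 on the wrong object: torchvision's ResNet only has a single fc attribute, while the fc0/fc1 heads live on the instance returned by init_model, so freezing should go through that instance (e.g. model.fc1). Note also that with the multi-head forward, outputs is a dict, so the criterion_xent(outputs, pids) call in the train function would need to loop over outputs.values() as in the snippet further up.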