Add multiple FC layers in parallel

Have you modified the class definition I wrote above? Because otherwise the code should work…

To make the post readable for everyone:

clf_outputs does not need to be a tensor. You are getting a dict as output (which is intended this way!). Any function that only accepts tensors (I don’t know exactly what you are trying to do) has to be applied to EVERY item in this dict, like this:

# loop over all classifier heads and apply the tensor-only function to each output
for k, v in clf_outputs.items():
    pytorch_fn(v)  # pytorch_fn stands for whatever tensor function you want to apply
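For example (just an illustration; torch.argmax stands in here for whatever tensor-only function you actually need):

import torch

# apply a tensor-only function to every classifier head's output
predictions = {k: torch.argmax(v, dim=1) for k, v in clf_outputs.items()}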

So is it possible to do transfer learning with the conv layers as a feature extractor this way?

model = ResNet50(5)
model.train(False)    # put the whole model in eval mode
model.fc.train(True)  # switch only the classifier head back to training mode

Yes, this should be possible. Depending on your task you should be able to reuse up to 99% of the code.

You have to unpack the tensors inside the tuple and then call .data on the unpacked tensors.
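A minimal sketch of what I mean, assuming the forward pass currently returns a 2-tuple (model and batch are just placeholders here):

output = model(batch)    # forward pass returns a tuple of two tensors
clf_out, feats = output  # unpack the tuple
feats = feats.data       # then call .data on the unpacked tensor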


The problem is

qf.append(features) #qf is an empty list

I’ve tried unpacking the tuple as well, but I still have to append the result, and I can’t append tensors with different dimensions.

That’s true but why do you even want to do this? I assume that one of the values is your classifier output and the other one is your feature vector?

As per my model above:

class ResNet50(nn.Module):
    def __init__(self, num_classes, loss={'htri'}, **kwargs):
        super(ResNet50, self).__init__()
        self.loss = loss
        resnet50 = torchvision.models.resnet50(pretrained=True)
        self.base = nn.Sequential(*list(resnet50.children())[:-2])
        num_fcs = 2
        for i in range(num_fcs):
            setattr(self, "fc%d" % i, nn.Linear(2048, num_classes))
        # self.fc0.train(False)
        self.fc1.train(False)  # the FC heads live on self, not on the torchvision resnet50

I think features[1] is the classifier output and would be used for calculating the loss, so where would features[0] be used?

In your model above, features[0] is the classifier output, while features[1] are the features extracted from the resnet (usually you don’t need them for calculating the loss).

So features[0] won’t be used anywhere, right? Not in finetuning either?

No, features[0] will be used everywhere and features[1] won’t be used anywhere


Thanks for the clarity

I think this could be due to the fact that you train different FC layers (with different loss functions) on the same feature extractor. Have you tried freezing the resnet and only training the FC layers?
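To illustrate what freezing could look like, here is a minimal sketch (it assumes the ResNet50 wrapper from above, so model.base is the shared feature extractor):

# freeze the shared resnet feature extractor so only the FC heads get updated
for param in model.base.parameters():
    param.requires_grad = False

# only hand the still-trainable parameters to the optimizer
optim = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()))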

Also, in the code you posted you don’t need the second FC layer, as only the first FC layer’s output is returned and used for loss calculation.

To be honest I think your whole network implementation and training idea is very strange. Do you simply want to finetune the resnet?

Yes, but in different ways. It’s a different learning method altogether

With the network definition as

import torch.nn as nn
import torch.nn.functional as F
import torchvision

class ResNet50(nn.Module):
    def __init__(self, num_classes, num_fcs=3, loss={'xent'}, **kwargs):
        super(ResNet50, self).__init__()
        self.loss = loss
        resnet50 = torchvision.models.resnet50(pretrained=True)
        # keep everything except the final pooling and FC layer as the feature extractor
        self.base = nn.Sequential(*list(resnet50.children())[:-2])
        self.num_fcs = num_fcs
        # create num_fcs parallel classifier heads: fc0, fc1, ...
        for i in range(num_fcs):
            setattr(self, "fc%d" % i, nn.Linear(2048, num_classes))

    def forward(self, x):
        x = self.base(x)
        x = F.avg_pool2d(x, x.size()[2:])  # global average pooling
        f = x.view(x.size(0), -1)          # flatten to (batch_size, 2048)

        # run the shared features through every FC head and return all outputs in a dict
        clf_outputs = {}
        for i in range(self.num_fcs):
            clf_outputs["fc%d" % i] = getattr(self, "fc%d" % i)(f)

        return clf_outputs

the training code should look like this:

import torch

if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

dataloader = ...  # YOUR CODE TO CREATE A DATALOADER
model = ResNet50(num_classes=10, num_fcs=2).to(device)
optim = torch.optim.Adam(model.parameters())

num_epochs = 100      # set your custom number here
switch_fc_epoch = 50  # set your custom number here


# I used the following loss functions as examples. You have to replace them with your own functions
loss_fc0 = torch.nn.MSELoss()
loss_fc1 = torch.nn.L1Loss()

for epoch in range(num_epochs):

    if epoch < switch_fc_epoch:
        model.fc0.train(True)
        model.fc1.train(False)
        output_fc = "fc0"
        loss_fn = loss_fc0
    else:
        model.fc0.train(False)
        model.fc1.train(True)
        output_fc = "fc1"
        loss_fn = loss_fc1

        # you may want to freeze the resnet structure at this point. If you want to do so, uncomment the following line
        # model.base.train(False)

    for batch, target in dataloader:
        batch, target = batch.to(device), target.to(device)

        clf_outputs = model(batch)

        optim.zero_grad()
        loss_value = loss_fn(clf_outputs[output_fc], target)  # index the dict with the selected head
        loss_value.backward()
        optim.step()

Thanks a lot for this, but I’ll be finetuning on a different dataset.

Then you simply have to create separate dataloaders in the if statement.
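A minimal sketch of that idea, where dataloader_a and dataloader_b are placeholders for whatever loaders you build for the two datasets:

if epoch < switch_fc_epoch:
    model.fc0.train(True)
    model.fc1.train(False)
    output_fc = "fc0"
    loss_fn = loss_fc0
    dataloader = dataloader_a  # loader for the first dataset
else:
    model.fc0.train(False)
    model.fc1.train(True)
    output_fc = "fc1"
    loss_fn = loss_fc1
    dataloader = dataloader_b  # loader for the second (finetuning) dataset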

From skimming your code it looks okay. Now you simply need to integrate my model definition and the way I select the used FC layer in each epoch.