How to modify the final FC layer of a torchvision model

My torch version is 0.1.9 and torchvision is 0.1.7, but the API of torchvision.models.vgg19() seems to be different from the docs. Here is the error. Is it a version problem?
I followed the solution from Pre-trained VGG16, but it still has a problem.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-cbf958b7f4be> in <module>()
      3 import torch.nn as nn
      4 import torch.nn.functional as F
----> 5 model = torchvision.models.vgg19(pretrained=True)
      6 for param in model.parameters():
      7     param.requires_grad = False

TypeError: vgg19() takes no arguments (1 given)

I ran into the same problem. I cloned the latest code repos of pytorch and torchvision and built both manually.
Or you can refer to this issue

Hmm, I didn’t actually try the code yesterday; I just adapted the line from a residual network.

Now I tried it and got the same error as you. It seems that you can get the model, but not its weights, which actually contradicts the documentation, where it says you can pass the pretrained argument.

As @Crazyai said, you can refer to that issue and download the model from GitHub. However, bear in mind that they are not planning to release the version with batch normalization. Alternatively, use a residual network and finetune it; you will probably get even better results than with VGG.


Pretrained VGG models are a new thing. Updating torchvision to the newest version should fix it.
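If you are not sure which versions you have installed, a quick check (assuming both packages expose the usual __version__ attribute):

import torch
import torchvision

# Pretrained VGG weights need a newer torchvision release than 0.1.7.
print(torch.__version__)
print(torchvision.__version__)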


I uninstalled the current torchvision, rebuilt the newest version, and it worked. Thank you very much!

My network is initialized as below:

#VGG19
import torchvision
import torch.nn as nn
import torch.nn.functional as F
vgg19 = torchvision.models.vgg19(pretrained=True)
for param in vgg19.parameters():
    param.requires_grad = False
# requires_grad=True by default for the newly added layer
vgg19.fc = nn.Linear(1000, 8)  # 8 output neurons, otherwise change it
vgg19.cuda()
import torch.optim as optim
criterion = nn.CrossEntropyLoss() 
optimizer = optim.SGD(vgg19.fc.parameters(), lr=0.001, momentum=0.9)#lr 0.001

But when I train with the code below, there is an error in loss.backward():
code

        inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = vgg19(inputs)
        #print(type(outputs),type(inputs))
        loss = criterion(outputs, labels)
        #loss = F.nll_loss(outputs, labels)
        loss.backward()        
        optimizer.step()
        step += 1
        # print statistics
        running_loss += loss.data[0]

error

RuntimeError: there are no graph nodes that require computing gradients


That’s because vgg19 doesn’t have an fc member variable. Instead, it has a

  (classifier): Sequential (
    (0): Dropout (p = 0.5)
    (1): Linear (25088 -> 4096)
    (2): ReLU (inplace)
    (3): Dropout (p = 0.5)
    (4): Linear (4096 -> 4096)
    (5): ReLU (inplace)
    (6): Linear (4096 -> 100)
  )

To replace the last linear layer, a temporary solution would be

vgg19.classifier._modules['6'] = nn.Linear(4096, 8)
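For completeness, a minimal sketch of the corrected setup under the same assumptions as the code above (8 target classes, frozen features, only the new layer trained):

import torch.nn as nn
import torch.optim as optim
import torchvision

vgg19 = torchvision.models.vgg19(pretrained=True)

# Freeze all pretrained weights.
for param in vgg19.parameters():
    param.requires_grad = False

# Replace the final classifier layer; its parameters are trainable by default.
vgg19.classifier._modules['6'] = nn.Linear(4096, 8)
vgg19.cuda()

# Pass only the new layer's parameters to the optimizer, so the frozen
# parameters never reach it and loss.backward() has something to compute.
optimizer = optim.SGD(vgg19.classifier._modules['6'].parameters(),
                      lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()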

Thank you. Then how should I set the last layer’s parameters to param.requires_grad = True?

A newly constructed layer has requires_grad=True by default, so you don’t need to set it manually.
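To double-check which parameters will actually be trained, you can iterate over them (a quick sanity check, assuming the vgg19 model from the code above):

# Only the replaced classifier layer should print True here.
for name, param in vgg19.named_parameters():
    print(name, param.requires_grad)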

Thank you. What should I change if I want to add dropout in the ‘fc’ part? The last classifier layer has already been changed to my number of classes, and I want to add dropout before it.

There’s already a dropout layer before the final FC layer; the code is

        self.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(4096, num_classes),
        )

You only need to replace the last nn.Linear(4096, num_classes) with your own FC layer.
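If you still want an extra dropout right in front of your new output layer, one option is to make the last classifier entry a small Sequential (a sketch, assuming the 8 output classes used earlier in the thread):

import torch.nn as nn

# Replace the final Linear with Dropout followed by the new output layer.
vgg19.classifier._modules['6'] = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Linear(4096, 8),
)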

I am trying to convert the classification network into a regression network by replacing the last layer so that it has a single output. Can you please suggest a solution? Any help would be greatly appreciated; I am new to PyTorch.

You can just do exactly this by setting the last layer as:

nn.Linear(4096, 1)

In your training procedure you probably want to use nn.MSELoss as your criterion.
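A minimal sketch of that regression setup, assuming a pretrained VGG as above (MSELoss expects float targets with the same shape as the output):

import torch.nn as nn
import torchvision

vgg = torchvision.models.vgg16(pretrained=True)

# Single regression output instead of class scores.
vgg.classifier._modules['6'] = nn.Linear(4096, 1)

# Mean squared error as the regression criterion.
criterion = nn.MSELoss()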

Hi,
Thank you for your response. However, I tried that; here is my code, but the model returns nothing.

class FineTuneModel(nn.Module):
    def __init__(self, original_model, num_classes):
        super(FineTuneModel, self).__init__()
        # Everything except the last linear layer
        self.features = nn.Sequential(*list(original_model.children())[:-1])
        self.classifier = nn.Sequential(
            nn.Linear(512, 1)
        )
        self.modelName = 'LightCNN-29'
        # Freeze those weights
        for p in self.features.parameters():
            p.requires_grad = False


    def forward(self, x):
        f = self.features(x)        
        f = f.view(f.size(0), -1)
        y = self.classifier(f)
        return y

model = FineTuneModel(original_model, args.num_classes)
print(model)

Output:

FineTuneModel(
  (features): Sequential()
  (classifier): Sequential(
    (0): Linear(in_features=512, out_features=1, bias=True)
  )
)

The original model is a pretrained ResNet model.


self.features seems to be empty.
Could you check that original_model is a valid model?

PS: You can add code blocks with three backticks (```). I’ve formatted your code for better readability.
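One way to narrow it down is to print original_model and push a dummy batch through the feature part (a debugging sketch, assuming the usual 3 x 224 x 224 input size):

import torch
import torch.nn as nn

# If this prints an empty module list, original_model itself is the problem.
print(original_model)

# Check what the feature extractor actually produces.
features = nn.Sequential(*list(original_model.children())[:-1])
out = features(torch.randn(1, 3, 224, 224))
print(out.shape)  # resnet18/34 give [1, 512, 1, 1]; resnet50 gives [1, 2048, 1, 1]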

I can do this with ResNet easily but apparently VGG has no fc attribute to call.
If I build:

resnet_baseline = models.resnet50(pretrained=True)
vgg_baseline = models.vgg16(pretrained=True)

I can see the last fc layer of ResNet with resnet_baseline.fc.in_features.
But for VGG I can only see the last fc layer with list(vgg_baseline.children())[-1][-1].

I even need the second index because the way the modules are constructed is different from ResNet.

I have attempted to recreate the functionality with:

vgg_baseline.add_module(module=nn.Linear(list(vgg_baseline.children())[-1][-1].in_features, 75), name='fc')

And now I can call vgg_baseline.fc

Any ideas why VGG behaves like that?
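One detail worth noting: add_module only registers the new layer on the model; VGG’s forward still calls features and then classifier, so a layer attached as fc never runs. Replacing the entry inside classifier, as earlier in this thread, avoids that (a sketch for the 75 outputs above):

import torch.nn as nn

# Replace the last classifier layer in place; forward() will actually use it,
# unlike a layer attached via add_module.
in_features = vgg_baseline.classifier._modules['6'].in_features
vgg_baseline.classifier._modules['6'] = nn.Linear(in_features, 75)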

Just want to add a bit more.
Suppose you use resnet18:

import torchvision.models as models
net = models.resnet18()

As you know, the output has 1000 classes:

(fc): Linear(in_features=512, out_features=1000, bias=True)

If you want to change the number of classes to 10, someone might do:

net.fc.out_features = 10

I know it doesn’t work, but if you print the net, it shows a changed output:

(fc): Linear(in_features=512, out_features=10, bias=True)

That is interesting. However, you will get an error when training. The correct approach is:

net.fc = nn.Linear(512, 10)

Similarly, if you want to change the input channels from 3 to 1, use:

net.conv1 = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)

rather than:

net.conv1.in_channels = 1

That’s it. This bothered me a bit when I was learning PyTorch, so I just wanted to point it out.
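Putting both replacements together, a small sketch (assuming single-channel input and 10 classes as in the example above):

import torch
import torch.nn as nn
import torchvision.models as models

net = models.resnet18()

# Replace the layers themselves, not just their attributes.
net.fc = nn.Linear(512, 10)
net.conv1 = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2),
                      padding=(3, 3), bias=False)

# Quick shape check with a dummy grayscale batch.
out = net(torch.randn(2, 1, 224, 224))
print(out.shape)  # torch.Size([2, 10])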


Hi,
How can I enable gradients for only a particular class in the fc layer?
Ex:
Consider a resnet18 model with 10 classes.
The shape of the fc.weight parameter is [10, 512].
I want to train only the weights corresponding to class 0,
meaning only one vector of 512 elements.
How can I do that?
Thanks

You cannot set the requires_grad attribute on slices of a parameter and would need to zero out the gradients of the frozen part of the parameter.
Alternatively you could also create two parameters (frozen and trainable), concatenate them and use the functional API for the layer operation.
However, the first approach might be a bit simpler.
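A minimal sketch of the first approach, assuming the resnet18 with 10 classes described above (the gradients of every fc row except class 0 are zeroed before the optimizer step):

import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(num_classes=10)

# Freeze everything except the final fc layer.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith('fc.')

optimizer = torch.optim.SGD([model.fc.weight, model.fc.bias], lr=0.01)

# Dummy batch just to produce gradients.
inputs = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, 10, (4,))

loss = nn.CrossEntropyLoss()(model(inputs), labels)
loss.backward()

# Zero the gradients of every fc row except the one for class 0,
# so only that 512-element weight vector (and its bias entry) is updated.
with torch.no_grad():
    model.fc.weight.grad[1:].zero_()
    model.fc.bias.grad[1:].zero_()

optimizer.step()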
