How can I load my best model as a feature extractor/evaluator?

I’m not sure what conv_layer is doing, but it looks like a function taking a linear layer and some parameters.
I changed your code a bit so that I could run it, and it’s working as expected.
Also, I don’t know what the third parameter in the constructor of nn.Linear should be.
The parameter is bias=True, so if you would like to use the bias, you should just pass True as the argument and not a number.

Here is my code snippet:

import torch
import torch.nn as nn
import torch.nn.functional as F

class My_net(nn.Module):
    def __init__(self):
        super(My_net, self).__init__()
        # Conv layer 1
        self.cl1 = nn.Linear(10, 20)
        # Conv layer 2
        self.cl2 = nn.Linear(20, 30)
        # FC1
        self.fc1 = nn.Linear(30, 40)
        # FC2
        self.fc2 = nn.Linear(40, 50)

    def forward(self, x):
        # conv layer 1
        x = self.cl1(x)
        x = F.relu(x)
        # conv layer 2
        x = self.cl2(x)
        x = F.relu(x)
        # Fully connected layer 1
        x = self.fc1(x)
        x = F.relu(x)
        # Fully connected layer 2
        x = self.fc2(x)
        x = F.relu(x)
        return x

model = My_net()
x = torch.randn(1, 10)
output = model(x)
print(output.shape)
> torch.Size([1, 50])

For fc2 it works.
But when I specify


I always get the fc2 activation rather than fc1, cl2, cl1 respectively.

Sorry, but I cannot reproduce the issue.
Here is my sample code and it prints all expected values:

model = My_net()
x = torch.randn(1, 10)
output = model(x)
> torch.Size([1, 20])
> torch.Size([1, 30])
> torch.Size([1, 40])
> torch.Size([1, 50])
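
The snippet above shows the printed shapes but not the hook registration that produces them. A minimal sketch of what such a script might look like, assuming the get_activation helper quoted later in this thread and a forward pass that applies all four layers:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class My_net(nn.Module):
    def __init__(self):
        super().__init__()
        self.cl1 = nn.Linear(10, 20)
        self.cl2 = nn.Linear(20, 30)
        self.fc1 = nn.Linear(30, 40)
        self.fc2 = nn.Linear(40, 50)

    def forward(self, x):
        x = F.relu(self.cl1(x))
        x = F.relu(self.cl2(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return x


activation = {}


def get_activation(name):
    # Returns a forward hook that stores the module output under `name`.
    def hook(module, inputs, output):
        activation[name] = output.detach()
    return hook


model = My_net()
# Register one hook per layer, each with its own key.
for layer_name in ["cl1", "cl2", "fc1", "fc2"]:
    getattr(model, layer_name).register_forward_hook(get_activation(layer_name))

x = torch.randn(1, 10)
output = model(x)
for layer_name, value in activation.items():
    print(layer_name, value.shape)
```

Each hook closes over its own name, so registering the same helper on several layers should not overwrite earlier entries.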

Could you post a small executable code snippet that reproduces this issue? The last one was incomplete, so I had to remove some undefined parts.


This works well if there is a limited number of layers, or layers with fixed names. How would you go about it when the model contains so many layers that you can’t write these statements explicitly?

Essentially, my aim is to obtain the activations from all the layers inside a model.
Thanks!

In that case, you could probably iterate all layers using:

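for name, layer in model.named_modules():
    # get_activation(name) is the hook factory that stores each output under its name
    layer.register_forward_hook(get_activation(name))

x = torch.randn(1, 10)
output = model(x)
for key in activation:
    print(activation[key])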

Thank you.
You resolved many of my issues.

Let’s assume we call model.fc1.register_forward_hook(get_activation('fc1')).
The forward function of the model contains:
x1 = self.fc1(x)
x2 = F.relu(x1)
So when we call print(activation['fc1']), does this return x1 or x2? (Both have the same dimensions.)

I am interested in getting x2 (after the ReLU layer). How can I get this? (All other conditions remain the same as in the original question.)

This would return the output of the registered module, so you would get x1.
If you would like to get the output of the F.relu, you could create an nn.ReLU() module and register a forward hook to this particular module (note that you shouldn’t reuse this module, but just apply it where you need its output) or alternatively you could register a forward hook to the next module and store the input instead.
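
A minimal sketch of the first suggestion, assuming a hypothetical two-layer Net in which the functional F.relu is replaced by a dedicated nn.ReLU module (named relu1 here) that is used only once:

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.relu1 = nn.ReLU()  # applied only after fc1, never reused
        self.fc2 = nn.Linear(20, 5)

    def forward(self, x):
        x1 = self.fc1(x)
        x2 = self.relu1(x1)
        return self.fc2(x2)


activation = {}


def get_activation(name):
    def hook(module, inputs, output):
        activation[name] = output.detach()
    return hook


model = Net()
model.fc1.register_forward_hook(get_activation("fc1"))      # stores x1
model.relu1.register_forward_hook(get_activation("relu1"))  # stores x2

out = model(torch.randn(4, 10))
print(activation["fc1"].shape, activation["relu1"].shape)
```

If relu1 were reused at several places in forward, the stored entry would be overwritten by the last call, which is why the module should be applied only where its output is needed.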


I found another way:
Write the same class with a modified forward -> instantiate this new class -> load the state_dict from the pretrained model -> call net.load_state_dict(pretrained_dict) -> set parameter.requires_grad = False -> switch to eval() mode -> get the intermediate output with trun_model(inputs_train)
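
The steps above could be sketched as follows. FullNet and TruncatedNet are hypothetical stand-ins, and the freshly constructed FullNet plays the role of the pretrained model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FullNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 5)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))


class TruncatedNet(FullNet):
    # Same layers (so the state_dict keys match), but forward stops early.
    def forward(self, x):
        return F.relu(self.fc1(x))


pretrained = FullNet()  # stand-in for a trained model
pretrained_dict = pretrained.state_dict()

trun_model = TruncatedNet()
trun_model.load_state_dict(pretrained_dict)
for parameter in trun_model.parameters():
    parameter.requires_grad = False
trun_model.eval()

inputs_train = torch.randn(4, 10)
with torch.no_grad():
    features = trun_model(inputs_train)
print(features.shape)
```

Keeping the unused fc2 in the truncated class lets load_state_dict succeed with its default strict matching.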

When I apply this to my model, it throws the following error:

AttributeError: 'list' object has no attribute 'detach'

I assume you are trying to store the input?
If so, note that it will be passed as a tuple, so you might need to index it before detaching:

activation[name] = input[0].detach()
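
For illustration, a small sketch of a hook that stores the input of the next module instead of an output. The Net class and the get_input_activation name are assumptions made up for this example:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 5)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return self.fc2(x)


activation = {}


def get_input_activation(name):
    def hook(module, inputs, output):
        # `inputs` is a tuple of the positional arguments passed to the module.
        activation[name] = inputs[0].detach()
    return hook


model = Net()
# The input of fc2 is the output of F.relu(fc1(x)).
model.fc2.register_forward_hook(get_input_activation("fc2_input"))

x = torch.randn(4, 10)
out = model(x)
print(activation["fc2_input"].shape)
```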

Hi ptrblck,
I have a similar problem I am trying to overcome and was hoping you could add some further explanation/ideas.

I am training a GAN architecture with the following VGG128 discriminator:

class Discriminator_VGG_128(nn.Module):
    def __init__(self, in_nc, nf):
        super(Discriminator_VGG_128, self).__init__()
        # [64, 128, 128]
        self.conv0_0 = nn.Conv2d(in_nc, nf, 3, 1, 1, bias=True)
        self.conv0_1 = nn.Conv2d(nf, nf, 4, 2, 1, bias=False)
        self.bn0_1 = nn.BatchNorm2d(nf, affine=True)
        # [64, 64, 64]
        self.conv1_0 = nn.Conv2d(nf, nf * 2, 3, 1, 1, bias=False)
        self.bn1_0 = nn.BatchNorm2d(nf * 2, affine=True)
        self.conv1_1 = nn.Conv2d(nf * 2, nf * 2, 4, 2, 1, bias=False)
        self.bn1_1 = nn.BatchNorm2d(nf * 2, affine=True)
        # [128, 32, 32]
        self.conv2_0 = nn.Conv2d(nf * 2, nf * 4, 3, 1, 1, bias=False)
        self.bn2_0 = nn.BatchNorm2d(nf * 4, affine=True)
        self.conv2_1 = nn.Conv2d(nf * 4, nf * 4, 4, 2, 1, bias=False)
        self.bn2_1 = nn.BatchNorm2d(nf * 4, affine=True)
        # [256, 16, 16]
        self.conv3_0 = nn.Conv2d(nf * 4, nf * 8, 3, 1, 1, bias=False)
        self.bn3_0 = nn.BatchNorm2d(nf * 8, affine=True)
        self.conv3_1 = nn.Conv2d(nf * 8, nf * 8, 4, 2, 1, bias=False)
        self.bn3_1 = nn.BatchNorm2d(nf * 8, affine=True)
        # [512, 8, 8]
        self.conv4_0 = nn.Conv2d(nf * 8, nf * 8, 3, 1, 1, bias=False)
        self.bn4_0 = nn.BatchNorm2d(nf * 8, affine=True)
        self.conv4_1 = nn.Conv2d(nf * 8, nf * 8, 4, 2, 1, bias=False)
        self.bn4_1 = nn.BatchNorm2d(nf * 8, affine=True)

        self.linear1 = nn.Linear(512 * 4 * 4, 100)
        self.linear2 = nn.Linear(100, 1)

        # activation function
        self.lrelu = nn.LeakyReLU(negative_slope=0.2, inplace=True)

    def forward(self, x):
        fea = self.lrelu(self.conv0_0(x))
        fea = self.lrelu(self.bn0_1(self.conv0_1(fea)))

        fea = self.lrelu(self.bn1_0(self.conv1_0(fea)))
        fea = self.lrelu(self.bn1_1(self.conv1_1(fea)))

        fea = self.lrelu(self.bn2_0(self.conv2_0(fea)))
        fea = self.lrelu(self.bn2_1(self.conv2_1(fea)))

        fea = self.lrelu(self.bn3_0(self.conv3_0(fea)))
        fea = self.lrelu(self.bn3_1(self.conv3_1(fea)))

        fea = self.lrelu(self.bn4_0(self.conv4_0(fea)))
        fea = self.lrelu(self.bn4_1(self.conv4_1(fea)))

        fea = fea.view(fea.size(0), -1)
        fea = self.lrelu(self.linear1(fea))
        out = self.linear2(fea)
        return out

I want to take the features from each layer of the network, for both real and fake input data, to inform a loss function as described in this paper and GitHub repository:

Perceptual Adversarial Networks

PAN - github loss function

I am essentially trying to translate the following Theano/Lasagne code to PyTorch for my network, but I am struggling with extracting the outputs from intermediate layers:

# Create expression for passing real data through the discriminator
dis1_f, dis2_f, dis3_f, dis4_f, disout_f = lasagne.layers.get_output(discriminator, input_y)
# Create expression for passing fake data through the discriminator
dis1_ff, dis2_ff, dis3_ff, dis4_ff, disout_ff = lasagne.layers.get_output(discriminator, gen_imgs)

Any help here would be much appreciated.

If you want to get all intermediate outputs, you could assign unique variables to them and return them all:

    def forward(self, x):
        fea0 = self.lrelu(self.conv0_0(x))
        fea1 = self.lrelu(self.bn0_1(self.conv0_1(fea0)))

        fea2 = self.lrelu(self.bn1_0(self.conv1_0(fea1)))

        return out, fea0, fea1, fea2, ...

Would that work for you?
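
A scaled-down sketch of this idea, using a hypothetical TinyDiscriminator instead of the full VGG128 model. The L1 distance between real and fake features is only an illustrative choice and is not taken from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyDiscriminator(nn.Module):
    def __init__(self, in_nc=3, nf=8):
        super().__init__()
        self.conv0 = nn.Conv2d(in_nc, nf, 3, 1, 1)
        self.conv1 = nn.Conv2d(nf, nf * 2, 4, 2, 1)
        self.linear = nn.Linear(nf * 2 * 8 * 8, 1)
        self.lrelu = nn.LeakyReLU(negative_slope=0.2)

    def forward(self, x):
        fea0 = self.lrelu(self.conv0(x))     # [nf, 16, 16]
        fea1 = self.lrelu(self.conv1(fea0))  # [nf * 2, 8, 8]
        out = self.linear(fea1.view(fea1.size(0), -1))
        # Return the final output together with every intermediate feature.
        return out, fea0, fea1


disc = TinyDiscriminator()
real = torch.randn(2, 3, 16, 16)
fake = torch.randn(2, 3, 16, 16)

out_real, *feats_real = disc(real)
out_fake, *feats_fake = disc(fake)

# Illustrative feature-matching term: sum of per-layer L1 distances.
perceptual = sum(F.l1_loss(f, r) for f, r in zip(feats_fake, feats_real))
print(perceptual.item())
```

Because the features are returned directly rather than detached in a hook, gradients can flow through them when the term is used in a loss.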

You saved my day, thanks!

Could you help me understand how I can save all the conv2d activations in the resnet101 BasicBlock?
The main goal is: I have a Faster R-CNN with a pretrained res101 model as the head. I trained it on KITTI, and now I want to see the activations of all the conv2d layers of res101 in BasicBlock and compare them when the input is augmented images.

I think the easiest way would be to register forward hooks as shown here.
Using some condition on all children, you should be able to just register the submodules you need.

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook
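
A possible sketch of such a condition, registering hooks only on nn.Conv2d submodules. The small Block and Sequential model below are hypothetical stand-ins for the ResNet backbone:

```python
import torch
import torch.nn as nn

activation = {}


def get_activation(name):
    def hook(module, inputs, output):
        activation[name] = output.detach()
    return hook


class Block(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return x + self.conv2(torch.relu(self.bn1(self.conv1(x))))


model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), Block(8), Block(8))

# Walk all submodules and hook only the convolutions.
handles = []
for name, module in model.named_modules():
    if isinstance(module, nn.Conv2d):
        handles.append(module.register_forward_hook(get_activation(name)))

model.eval()
with torch.no_grad():
    model(torch.randn(1, 3, 16, 16))

print(sorted(activation))

# Remove the hooks once the activations have been collected.
for handle in handles:
    handle.remove()
```

The keys are the dotted names from named_modules(), so activations from an original and an augmented input can be compared layer by layer under the same key.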

Is there a more efficient way to use this in the forward pass without initializing the activations in init as self.activation = {}? My current implementation is like this:

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.activation = {}
        self.encoder = Encoder  # Encoder is defined elsewhere

    def forward(self, x):
        x = self.encoder(x)
        return self.activation[name]  # name is not defined in this snippet

    def get_activation(self, name):
        def hook(model, input, output):
            self.activation[name] = output.detach()
        return hook

Here, in each forward pass, the model edits the state of the class, which is not good practice. Yet I also cannot modify get_activation to return the activations dictionary because of the hook function constraint. I wonder if there is a better way of using get_activation in the forward pass without having to move the activations dict into init.
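
One possible way around this, sketched under the assumption that the hooks may be registered and removed on every call: a standalone helper (forward_with_activations is a made-up name) keeps the dictionary local and returns it, so the model itself holds no extra state.

```python
import torch
import torch.nn as nn


def forward_with_activations(model, x, layer_names):
    """Run model(x) and return (output, activations of the named layers)."""
    activations = {}
    modules = dict(model.named_modules())
    handles = []

    def make_hook(name):
        def hook(module, inputs, output):
            activations[name] = output.detach()
        return hook

    for name in layer_names:
        handles.append(modules[name].register_forward_hook(make_hook(name)))
    try:
        output = model(x)
    finally:
        # Always remove the hooks so repeated calls do not accumulate them.
        for handle in handles:
            handle.remove()
    return output, activations


# Stand-in encoder; the layer names "0" and "2" come from nn.Sequential.
encoder = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))
output, acts = forward_with_activations(encoder, torch.randn(4, 10), ["0", "2"])
print({name: tuple(value.shape) for name, value in acts.items()})
```

Registering hooks on every call adds a little overhead, so for tight training loops a persistent hook may still be preferable.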


Great, thanks a lot.

Since someone else may also prefer not to use hooks, I will leave this code snippet here, which I found quite useful for extracting features from torchvision models (adapted from here):

import torch
import torchvision

def feature_forward(self, x):
    x = self.conv1(x)
    x = self.bn1(x)
    x = self.relu(x)
    x = self.maxpool(x)
    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    x = self.layer4(x)
    x = self.avgpool(x)
    feature_vector = torch.flatten(x, 1)
    x = self.fc(feature_vector)
    return x, feature_vector

torchvision.models.ResNet.forward.__code__ = feature_forward.__code__
model = torchvision.models.resnet34()
classes, features = model(torch.randn(1,3,224,224))
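
A different option that avoids both hooks and patching forward.__code__ is to swap the final layer for nn.Identity, assuming the model ends in a classifier stored in an attribute such as fc (as torchvision ResNets do). The TinyClassifier below is a hypothetical stand-in so the sketch runs without torchvision:

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(8, 10)

    def forward(self, x):
        x = torch.flatten(self.backbone(x), 1)
        return self.fc(x)


model = TinyClassifier().eval()

# Keep the classifier head, then replace it so forward returns the features.
head = model.fc
model.fc = nn.Identity()

with torch.no_grad():
    features = model(torch.randn(1, 3, 32, 32))
    classes = head(features)
print(features.shape, classes.shape)
```

This only affects the one instance, whereas replacing forward.__code__ changes every ResNet created afterwards.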

Hi @ptrblck. I’m wondering whether I can use forward hooks to get the logits (not probabilities) of a trained model. Is it possible to feed the test set to our trained model and get the logits with hooks?