How can I load my best model as a feature extractor/evaluator?

Hi back @ptrblck,

Thank you for your help. But I got stuck once again and I'm confused. Here is my network architecture (variable net):

net
ConvNet_LeNet5(
(cl1): Linear(in_features=25, out_features=32, bias=True)
(cl2): Linear(in_features=25, out_features=64, bias=True)
(fc1): Linear(in_features=51200, out_features=100, bias=True)
(fc2): Linear(in_features=100, out_features=4, bias=True)
)

My best network is saved as follows:

        save_checkpoint({
            'epoch': int(epoch) + 1,
            'state_dict': net.state_dict(),
            'best_prec1': best_prec1,
            'optimizer': optimizer.state_dict(),
        }, is_best)
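(save_checkpoint itself is basically the usual helper that serializes the dict and keeps a copy of the best model; roughly something like this sketch, where the filenames are assumptions on my part:)

import shutil
import torch

def save_checkpoint(state, is_best, filename='checkpoint.pth.tar'):
    # serialize the whole dict (epoch, state_dict, best_prec1, optimizer)
    torch.save(state, filename)
    if is_best:
        # keep a separate copy of the best checkpoint
        shutil.copyfile(filename, 'model_best.pth.tar')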

Before loading the best model, let's try the forward hook as you suggested:

net.fc2.register_forward_hook(get_activation('fc2'))

It works well.

But we need to load the best model as a feature extractor. I did that as follows:

my_best_model = net.load_state_dict(torch.load('model_best.pth.tar'))

I get the following error:

*** KeyError: 'unexpected key "epoch" in state_dict'

Then I loaded the checkpoint without calling load_state_dict, as follows:

my_best_model = torch.load('model_best.pth.tar')
my_best_model.keys()
dict_keys(['epoch', 'state_dict', 'best_prec1', 'optimizer'])
my_best_model = my_best_model['state_dict']
my_best_model.keys()
odict_keys(['cl1.weight', 'cl1.bias', 'cl2.weight', 'cl2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias'])
my_best_model = my_best_model['fc2.weight']

I'm not sure if this is the correct way to load fc2 as a feature extractor (since I didn't succeed in doing that with the hook). Please correct me.

x is a test example

output = my_best_model(x)

It returns: *** TypeError: 'torch.cuda.FloatTensor' object is not callable

All that is needed is to fix

my_best_model = net.load_state_dict(torch.load('model_best.pth.tar')) # error: *** KeyError: 'unexpected key "epoch" in state_dict'
and
my_best_model.fc2.register_forward_hook(get_activation('fc2'))

Thank you for your help @ptrblck

Since you saved your checkpoint as a dict, you will also load it as such.
Therefore, to get your state_dict, you have to index the loaded dict with checkpoint['state_dict'].

Also, if you would like to use fc2 as a feature extractor, you would have to restore your complete model and calculate the complete forward pass with your sample.
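I.e. something along these lines (just a sketch, using the names from your post):

checkpoint = torch.load('model_best.pth.tar')
net.load_state_dict(checkpoint['state_dict'])   # restore the trained weights into net
output = net(x)                                 # full forward pass; the fc2 hook fires here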

Why did the hook approach not work?

@ptrblck I have an update,

I hope that I fixed loading the state_dict correctly. I did it as follows:

pretrained_dict=torch.load('model_best.pth.tar') 
pretrained_dict=pretrained_dict['state_dict']

then

net.load_state_dict(pretrained_dict)
net.fc2.register_forward_hook(get_activation('fc2'))

Up to here it works fine without any error.
Can you confirm that with this process I load the pretrained weights (best model) of net, and not the initial random weights of net (used to initialize the network)?

Then I did the following:
output = net(test_x[0], coord_test[0], adj_test[0], L_test[0],lmax_test[0],mask_test[0])

and I got the following error:

*** IndexError: invalid index to scalar variable.

Thank you @ptrblck for your help

The loading looks fine!
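(If you want to verify it yourself, one quick sanity check, just a suggestion on my side and not part of the checkpoint code above, is to compare a parameter before and after loading:)

before = net.fc2.weight.detach().clone()    # weights from the random initialization
net.load_state_dict(pretrained_dict)        # restore the checkpointed weights
print(torch.equal(before, net.fc2.weight))  # should print False once the pretrained weights are in place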

It looks like you are using a LeNet-style architecture and trying to call the forward pass on multiple inputs.
Could you print the shape of test_x?
Also, what are the other tensors? I assume test_y is the target.

@ptrblck, problem solved. batch_size=4, so I need to forward 4 examples.

I have a question for you, if you don't mind:

Now I want to extract features from layer fc1. However, when I apply:

net.fc1.register_forward_hook(get_activation('fc1'))

I always get the output of fc2.

net
ConvNet_LeNet5(
(cl1): Linear(in_features=25, out_features=32, bias=True)
(cl2): Linear(in_features=25, out_features=64, bias=True)
(fc1): Linear(in_features=51200, out_features=100, bias=True)
(fc2): Linear(in_features=100, out_features=40, bias=True)
)

When I apply

net.fc1.register_forward_hook(get_activation('fc1'))

I'm supposed to get a vector of 100 features; however, I get 40.

Does your hook function

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

work only for the last layer fc2?

Thank you

I'm not sure why you don't get the right activations.
The model definition looks a bit strange. If the model is designed to run sequentially, the shapes of the linear layers don't match (e.g. cl2 outputs 64 features while fc1 expects 51200 input features).

In my code example the hooks run fine. Could you check if you used the same code logic?

Hi @ptrblck,

It's not a problem with the shape; I use x.view() to reshape. I'm working with a graph ConvNet.

Here are my __init__() and forward() functions:

class My_net(nn.Module):

    def __init__(self, net_parameters):

        print('My_net')

        super(My_net, self).__init__()
        np.random.seed(seed)

        D, CL1_F, CL1_K, FC1_F, FC2_F, p = net_parameters

       	# Conv layer 1 
        
        self.cl1 = nn.Linear(CL1_K, CL1_F,10)
        Fin = CL1_K;
        Fout = CL1_F;
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.cl1.weight.data.uniform_(-scale, scale)
        self.cl1.bias.data.fill_(0.0)
        self.CL1_K = CL1_K;
        self.CL1_F = CL1_F;


        # Conv layer 2
        self.cl2 = nn.Linear(CL2_K, CL2_F,10)
        Fin = CL2_K * CL1_F;
        Fout = CL2_F;
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.cl2.weight.data.uniform_(-scale, scale)
        self.cl2.bias.data.fill_(0.0)
        self.CL2_K = CL2_K;
        self.CL2_F = CL2_F;



        # FC1
        self.fc1 = nn.Linear(FC1Fin, FC1_F,10)
        Fin = FC1Fin;
        Fout = FC1_F;
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.fc1.weight.data.uniform_(-scale, scale)
        self.fc1.bias.data.fill_(0.0)
        self.FC1Fin = FC1Fin

        # FC2
        self.fc2 = nn.Linear(FC1_F, FC2_F,10)
        Fin = FC1_F;
        Fout = FC2_F;
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.fc2.weight.data.uniform_(-scale, scale)
        self.fc2.bias.data.fill_(0.0)




    def forward(self, x, d, L, lmax, coord, adj_matrix,n_nodes):



        # conv layer 1
        x = conv_layer(x, self.cl1, L, lmax, self.CL1_F, self.CL1_K,n_nodes,masked=True)
        x = F.relu(x)


        # conv layer 2
        x = conv_layer(x, self.cl1, L, lmax, self.CL1_F, self.CL1_K, n_nodes, masked=True)
        x = F.relu(x)

        # Fully connected layer 1

        x = self.fc1(x)
        x = F.relu(x)




        # Fully connected layer 2

        x = self.fc2(x)
        x = F.relu(x)

        return x

I'm not sure what conv_layer is doing, but it looks like a function taking a linear layer and some parameters.
I changed your code a bit so that I can run it, and it's working as expected.
Also, I don't know what the third parameter in the constructor of nn.Linear should be.
That parameter is bias=True, so if you would like to use the bias, you should just pass True as the argument, not a number.
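E.g.:

fc = nn.Linear(100, 4)                       # bias=True is already the default
fc_no_bias = nn.Linear(100, 4, bias=False)   # disable the additive bias term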

Here is my code snippet:

class My_net(nn.Module):
    def __init__(self):
        super(My_net, self).__init__()
        # Conv layer 1
        self.cl1 = nn.Linear(10, 20)
        # Conv layer 2
        self.cl2 = nn.Linear(20, 30)
        # FC1
        self.fc1 = nn.Linear(30, 40)
        # FC2
        self.fc2 = nn.Linear(40, 50)

    def forward(self, x):
        # conv layer 1
        x = self.cl1(x)
        x = F.relu(x)
        # conv layer 2
        x = self.cl2(x)
        x = F.relu(x)
        # Fully connected layer 1
        x = self.fc1(x)
        x = F.relu(x)
        # Fully connected layer 2
        x = self.fc2(x)
        x = F.relu(x)
        return x

model = My_net()
model.fc2.register_forward_hook(get_activation('fc2'))
x = torch.randn(1, 10)
output = model(x)
print(activation['fc2'].shape)
> torch.Size([1, 50])

For fc2 it works,
but when I specify

model.fc1.register_forward_hook(get_activation('fc1'))
model.cl2.register_forward_hook(get_activation('cl2'))
model.cl1.register_forward_hook(get_activation('cl1'))

I always get the fc2 activation rather than fc1, cl2, and cl1 respectively.

Sorry, but I cannot reproduce the issue.
Here is my sample code and it prints all expected values:

model = My_net()
model.cl1.register_forward_hook(get_activation('cl1'))
model.cl2.register_forward_hook(get_activation('cl2'))
model.fc1.register_forward_hook(get_activation('fc1'))
model.fc2.register_forward_hook(get_activation('fc2'))
x = torch.randn(1, 10)
output = model(x)
print(activation['cl1'].shape)
print(activation['cl2'].shape)
print(activation['fc1'].shape)
print(activation['fc2'].shape)
> torch.Size([1, 20])
> torch.Size([1, 30])
> torch.Size([1, 40])
> torch.Size([1, 50])

Could you post a small executable code snippet with this issue? The last one was not complete, so I had to remove some unknown stuff.


This works well if there is a limited number of layers or if they have fixed names. How do you go about it when the model contains so many layers that you can't explicitly write these statements?

Essentially, my aim is to obtain the activations from all the layers inside a model.
Thanks! 🙂

In that case, you could probably iterate over all layers using:

for name, layer in model.named_modules():
    layer.register_forward_hook(get_activation(name))

x = torch.randn(1, 10)
output = model(x)
for key in activation:
    print(activation[key])
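If you only want the leaf modules (the loop above also visits the model itself and any containers), you could filter them, e.g. (just a variation of the same idea):

for name, layer in model.named_modules():
    if len(list(layer.children())) == 0:   # skip the root module and containers
        layer.register_forward_hook(get_activation(name))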

Thank you.
You resolved many of my issues.

Let's assume we do model.fc1.register_forward_hook(get_activation('fc1')).
The model's forward function says:
x1 = self.fc1(x)
x2 = F.relu(x1)
So when we do print(activation['fc1']), does this return x1 or x2? (Both have the same dimension.)

I am interested in getting x2 (after the ReLU layer). How can I get this? (All other conditions remain the same as in the original question.)

This would return the output of the registered module, so you would get x1.
If you would like to get the output of F.relu, you could create an nn.ReLU() module and register a forward hook on this particular module (note that you shouldn't reuse this module, but just apply it where you need its output), or alternatively you could register a forward hook on the next module and store the input instead.
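To make that concrete, here is a small sketch of both options (the SmallNet class, layer sizes, and names are made up for illustration; get_activation/activation are the same as above):

# Option 1: use a dedicated nn.ReLU module and register the hook on it
class SmallNet(nn.Module):
    def __init__(self):
        super(SmallNet, self).__init__()
        self.fc1 = nn.Linear(30, 40)
        self.relu1 = nn.ReLU()      # not reused anywhere else in the model
        self.fc2 = nn.Linear(40, 50)

    def forward(self, x):
        x1 = self.fc1(x)
        x2 = self.relu1(x1)
        return self.fc2(x2)

model = SmallNet()
model.relu1.register_forward_hook(get_activation('relu1'))   # stores x2

# Option 2: hook the *next* module (fc2) and store its input, which is exactly x2
def get_input(name):
    def hook(model, input, output):
        activation[name] = input[0].detach()   # input is passed as a tuple
    return hook

model.fc2.register_forward_hook(get_input('relu1_out'))

x = torch.randn(1, 30)
output = model(x)
print(activation['relu1'].shape)      # torch.Size([1, 40])
print(activation['relu1_out'].shape)  # torch.Size([1, 40])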


I found another way:
Write the same class with a modified forward() -> instantiate this new class -> load the 'state_dict' from the pretrained model -> call net.load_state_dict(pretrained_dict) -> set parameter.requires_grad = False -> set eval() mode -> get the intermediate output with trun_model(inputs_train)

When I apply this to my model, it throws the following error:

AttributeError: 'list' object has no attribute 'detach'

I assume you are trying to store the input?
If so, note that it will be passed as a tuple, so you might have to index it before detaching:

activation[name] = input[0].detach()

Hi ptrblck,
I have a similar problem I am trying to overcome and was hoping you could add some further explanation/ideas.

I am training a GAN architecture with the following VGG128 discriminator:

class Discriminator_VGG_128(nn.Module):
    def __init__(self, in_nc, nf):
        super(Discriminator_VGG_128, self).__init__()
        # [64, 128, 128]
        self.conv0_0 = nn.Conv2d(in_nc, nf, 3, 1, 1, bias=True)
        self.conv0_1 = nn.Conv2d(nf, nf, 4, 2, 1, bias=False)
        self.bn0_1 = nn.BatchNorm2d(nf, affine=True)
        # [64, 64, 64]
        self.conv1_0 = nn.Conv2d(nf, nf * 2, 3, 1, 1, bias=False)
        self.bn1_0 = nn.BatchNorm2d(nf * 2, affine=True)
        self.conv1_1 = nn.Conv2d(nf * 2, nf * 2, 4, 2, 1, bias=False)
        self.bn1_1 = nn.BatchNorm2d(nf * 2, affine=True)
        # [128, 32, 32]
        self.conv2_0 = nn.Conv2d(nf * 2, nf * 4, 3, 1, 1, bias=False)
        self.bn2_0 = nn.BatchNorm2d(nf * 4, affine=True)
        self.conv2_1 = nn.Conv2d(nf * 4, nf * 4, 4, 2, 1, bias=False)
        self.bn2_1 = nn.BatchNorm2d(nf * 4, affine=True)
        # [256, 16, 16]
        self.conv3_0 = nn.Conv2d(nf * 4, nf * 8, 3, 1, 1, bias=False)
        self.bn3_0 = nn.BatchNorm2d(nf * 8, affine=True)
        self.conv3_1 = nn.Conv2d(nf * 8, nf * 8, 4, 2, 1, bias=False)
        self.bn3_1 = nn.BatchNorm2d(nf * 8, affine=True)
        # [512, 8, 8]
        self.conv4_0 = nn.Conv2d(nf * 8, nf * 8, 3, 1, 1, bias=False)
        self.bn4_0 = nn.BatchNorm2d(nf * 8, affine=True)
        self.conv4_1 = nn.Conv2d(nf * 8, nf * 8, 4, 2, 1, bias=False)
        self.bn4_1 = nn.BatchNorm2d(nf * 8, affine=True)

        self.linear1 = nn.Linear(512 * 4 * 4, 100)
        self.linear2 = nn.Linear(100, 1)

        # activation function
        self.lrelu = nn.LeakyReLU(negative_slope=0.2, inplace=True)


    def forward(self, x):
        fea = self.lrelu(self.conv0_0(x))
        fea = self.lrelu(self.bn0_1(self.conv0_1(fea)))

        fea = self.lrelu(self.bn1_0(self.conv1_0(fea)))
        fea = self.lrelu(self.bn1_1(self.conv1_1(fea)))

        fea = self.lrelu(self.bn2_0(self.conv2_0(fea)))
        fea = self.lrelu(self.bn2_1(self.conv2_1(fea)))

        fea = self.lrelu(self.bn3_0(self.conv3_0(fea)))
        fea = self.lrelu(self.bn3_1(self.conv3_1(fea)))

        fea = self.lrelu(self.bn4_0(self.conv4_0(fea)))
        fea = self.lrelu(self.bn4_1(self.conv4_1(fea)))

        fea = fea.view(fea.size(0), -1)
        fea = self.lrelu(self.linear1(fea))
        out = self.linear2(fea)
 
        return out

I want to take the features from each layer of the network, for both real data input and fake data input, to inform a loss function as described in this paper and GitHub repository:

Perceptual Adversarial Networks

PAN - github loss function

I am essentially trying to translate the following Theano/Lasagne code to PyTorch for my network, but I am struggling with extracting the outputs from the intermediate layers:

# Create expression for passing real data through the discriminator
        dis1_f, dis2_f, dis3_f, dis4_f, disout_f = lasagne.layers.get_output(discriminator,input_y)
        # Create expression for passing fake data through the discriminator
        dis1_ff, dis2_ff, dis3_ff, dis4_ff, disout_ff = lasagne.layers.get_output(discriminator,gen_imgs)

Any help here would be much appreciated.

If you want to get all intermediate outputs, you could assign unique variables to them and return them all:

...
   def forward(self, x):
        fea0 = self.lrelu(self.conv0_0(x))
        fea1 = self.lrelu(self.bn0_1(self.conv0_1(fea0)))

        fea2 = self.lrelu(self.bn1_0(self.conv1_0(fea1)))

        return out, fea0, fea1, fea2, ...

Would that work for you?
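The two Lasagne calls above would then roughly map to something like this (just a sketch, assuming forward returns out, fea0, fea1, fea2 as in the truncated snippet):

# passing real data through the discriminator
disout_f, dis1_f, dis2_f, dis3_f = discriminator(input_y)
# passing fake data through the discriminator
disout_ff, dis1_ff, dis2_ff, dis3_ff = discriminator(gen_imgs)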