How can I load my best model as a feature extractor/evaluator?


I have stored my best model, where the network is as follows:


  (cl1): Linear(in_features=25, out_features=60, bias=True)
  (cl2): Linear(in_features=60, out_features=16, bias=True)
  (fc1): Linear(in_features=16, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)

To load my best model, I did the following:

my_best_model = torch.load('path to best model')
my_best_model.keys()
dict_keys(['epoch', 'state_dict', 'best_prec1', 'optimizer'])

What would I like to do?

How can I load and use my_best_model as:

  1. a feature extractor?
  2. an evaluator on new examples?

Thank you


You would have to load the state_dict and set it to eval():

model = MyModel(...)

Now you can use it to evaluate new samples:

new_sample = ...
output = model(new_sample)
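
Putting the steps above together (create the model, load the state_dict, call eval(), then run new samples), a minimal sketch could look like the following. The layer sizes follow the model in this thread; the in-memory buffer and the random `new_samples` tensor are hypothetical stand-ins for a real checkpoint file and real test data.

```python
import io

import torch
import torch.nn as nn
import torch.nn.functional as F


class MyModel(nn.Module):
    # Layer layout taken from the model printed in this thread
    def __init__(self):
        super().__init__()
        self.cl1 = nn.Linear(25, 60)
        self.cl2 = nn.Linear(60, 16)
        self.fc1 = nn.Linear(16, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.relu(self.cl1(x))
        x = F.relu(self.cl2(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.log_softmax(self.fc3(x), dim=1)


# Assumption: a "trained" model whose weights are saved to an in-memory buffer
# (a real script would save to and load from a file path instead)
trained = MyModel()
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

model = MyModel()
model.load_state_dict(torch.load(buffer))
model.eval()  # dropout / batchnorm layers (if any) switch to inference mode

new_samples = torch.randn(4, 25)  # hypothetical batch of 4 new examples
with torch.no_grad():  # no gradients are needed for evaluation
    log_probs = model(new_samples)
    predictions = log_probs.argmax(dim=1)  # predicted class index per sample
```

Wrapping the forward pass in torch.no_grad() is optional for correctness, but it avoids building the autograd graph when you only want predictions.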

What do you mean by feature extractor?
Would you like to get a certain intermediate layer output from your model?


Thank you for your reply. By feature extractor, I mean getting the output from the last fully connected layer before softmax.

You could return this activation together with the output in your forward method or use a forward_hook on the layer.
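
For the first option, a rough sketch of a forward method that returns the activation alongside the output might look like this. It assumes the layer sizes from this thread and treats the ReLU output of fc2 (the input to the final classifier layer) as the "features"; which tensor you return is your choice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyModelWithFeatures(nn.Module):
    # Hypothetical variant of the model in this thread
    def __init__(self):
        super().__init__()
        self.cl1 = nn.Linear(25, 60)
        self.cl2 = nn.Linear(60, 16)
        self.fc1 = nn.Linear(16, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.relu(self.cl1(x))
        x = F.relu(self.cl2(x))
        x = F.relu(self.fc1(x))
        features = F.relu(self.fc2(x))  # activation fed to the classifier
        output = F.log_softmax(self.fc3(features), dim=1)
        return output, features  # return both instead of only the output


model = MyModelWithFeatures()
model.eval()
with torch.no_grad():
    output, features = model(torch.randn(1, 25))
```

The drawback is that every caller of the model now has to unpack two values, which is why a forward hook is often more convenient for an already trained model.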

What if I would like to do that for an intermediate layer, let's say the fourth layer?

I’ve created a small code snippet using a forward hook to store one activation from fc2:

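import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.cl1 = nn.Linear(25, 60)
        self.cl2 = nn.Linear(60, 16)
        self.fc1 = nn.Linear(16, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.relu(self.cl1(x))
        x = F.relu(self.cl2(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.log_softmax(self.fc3(x), dim=1)
        return x

activation = {}
def get_activation(name):
    # Returns a hook that stores the layer's output under the given name
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

model = MyModel()
# Register the hook on fc2 so its output is stored during the forward pass
model.fc2.register_forward_hook(get_activation('fc2'))
x = torch.randn(1, 25)
output = model(x)
print(activation['fc2'])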

Thanks a lot. Awesome

In your small code snippet, we need to wrap x in a Variable before feeding it to the model.

output = model(x)

And print(activation['fc2']) gives
*** KeyError: 'fc2'

If you are using PyTorch < 0.4.0, you have to wrap it into a Variable.
The most recent stable version is 0.4.0 where Variables and tensors were merged.
Have a look at the Migration Guide.
You’ll find the install instructions on the website.

The KeyError is strange. Have you registered the activation with get_activation('fc2')?
The name argument is used to store the activation in the dict, so I'm wondering why the error happened.

I did


Which PyTorch and Python version are you using?


and python3.6

I just installed an env with PyTorch 0.3.1.post3 and Python 3.6.5.
Since in this PyTorch version, Variables and tensors weren’t merged, you have to use:

x = Variable(torch.randn(1, 25))

Your code should therefore throw an error when you try to run output = model(x).
Besides that, the code runs fine on my machine.

Hi back @ptrblck,

Thank you for your help. But I got stuck once again and I'm confused. Here is my network architecture (variable net):

(cl1): Linear(in_features=25, out_features=32, bias=True)
(cl2): Linear(in_features=25, out_features=64, bias=True)
(fc1): Linear(in_features=51200, out_features=100, bias=True)
(fc2): Linear(in_features=100, out_features=4, bias=True)

My best network is saved as follows:

            'epoch': int(epoch) + 1,
            'state_dict': net.state_dict(),
            'best_prec1': best_prec1,
            'optimizer': optimizer.state_dict(),
        }, is_best)

Before loading the best model let’s try forward hook as you suggested


works well.

But we need to load the best model as a feature extractor. I did that as follows:

I get the following error:
*** KeyError: 'unexpected key "epoch" in state_dict'
Then I loaded the best model without calling load_state_dict, as follows:

my_best_model = torch.load('model_best.pth.tar')
my_best_model.keys()
dict_keys(['epoch', 'state_dict', 'best_prec1', 'optimizer'])
my_best_model['state_dict'].keys()
odict_keys(['cl1.weight', 'cl1.bias', 'cl2.weight', 'cl2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias'])

I'm not sure if it's the correct way to load fc2 as a feature extractor (since I didn't succeed in doing that with the hook). Please correct me.

x is a test example

It returns *** TypeError: 'torch.cuda.FloatTensor' object is not callable

All that is needed is to fix
my_best_model = net.load_state_dict(torch.load('model_best.pth.tar')) # error *** KeyError: 'unexpected key "epoch" in state_dict'

Thank you for your help @ptrblck

Since you saved your checkpoint as a dict, you will also load it as such.
Therefore, to get your state_dict you have to index the loaded checkpoint with checkpoint['state_dict'].
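
To illustrate why the "unexpected key" error shows up, here is a small sketch. It assumes a single nn.Linear layer as a stand-in for the real network, made-up values for 'epoch' and 'best_prec1', and an in-memory buffer instead of 'model_best.pth.tar'. Passing the whole checkpoint dict to load_state_dict fails, while passing checkpoint['state_dict'] works.

```python
import io

import torch
import torch.nn as nn

net = nn.Linear(25, 4)  # hypothetical stand-in for the real network

# Save a checkpoint dict with the same kind of keys as in this thread
buffer = io.BytesIO()  # in-memory file, so no path on disk is needed
torch.save(
    {'epoch': 5, 'state_dict': net.state_dict(), 'best_prec1': 0.9},
    buffer,
)
buffer.seek(0)

checkpoint = torch.load(buffer)  # this is the dict, not the weights

restored = nn.Linear(25, 4)
try:
    # Wrong: 'epoch', 'state_dict', ... are not parameter names of the model
    restored.load_state_dict(checkpoint)
    whole_dict_failed = False
except (RuntimeError, KeyError):  # the exception type depends on the version
    whole_dict_failed = True

# Right: pass only the entry that holds the parameters
restored.load_state_dict(checkpoint['state_dict'])
restored.eval()
```

Note that load_state_dict modifies the model in place, so its return value should not be assigned to the variable you later call as the model.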

Also, if you would like to use the fc2 as a feature extractor, you would have to restore your complete model and calculate the complete forward pass with your sample.

Why did the hook approach not work?

@ptrblck I have an update,

I hope that I fixed loading the state dict correctly. I did that as follows:




Until here it works fine without any error.
Can you confirm that with this process I load the pretrained weights (best model) of net and not the initial random weights of net (used to initialize the network)?

Then I did the following:
output = net(test_x[0], coord_test[0], adj_test[0], L_test[0],lmax_test[0],mask_test[0])

and I got the following error:
*** IndexError: invalid index to scalar variable.

Thank you @ptrblck for your help

The loading looks fine!
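
If you want to double-check that the network really holds the checkpoint weights and not the random initialization, one possible sanity check is to compare every entry of the model's state_dict with the checkpoint. This is only a sketch: the nn.Linear layers are hypothetical stand-ins for the real net, and the checkpoint is built in memory rather than loaded from disk.

```python
import torch
import torch.nn as nn

trained = nn.Linear(25, 4)  # hypothetical stand-in for the trained network
checkpoint = {'state_dict': trained.state_dict()}

net = nn.Linear(25, 4)  # freshly created, so it has its own random weights
differs_before = not torch.equal(
    net.weight, checkpoint['state_dict']['weight']
)

net.load_state_dict(checkpoint['state_dict'])

# After loading, every tensor in the model should equal the checkpoint entry
all_match = all(
    torch.equal(tensor, checkpoint['state_dict'][name])
    for name, tensor in net.state_dict().items()
)
print(differs_before, all_match)
```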

It looks like you use a LeNet style architecture and try to call the forward pass on multiple inputs.
Could you print the shape of test_x?
Also, what are the other tensors? I assume test_y is the target.

@ptrblck, problem solved. batch_size=4, so I need to forward 4 examples.

I have a question for you if you don't mind:

Now I want to extract features from layer fc1. However, when I apply:


I always get the output of fc2.

(cl1): Linear(in_features=25, out_features=32, bias=True)
(cl2): Linear(in_features=25, out_features=64, bias=True)
(fc1): Linear(in_features=51200, out_features=100, bias=True)
(fc2): Linear(in_features=100, out_features=40, bias=True)

When I apply

I'm supposed to get a vector of 100 features; however, I get 40.

Does your hook function

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

work only for the last layer fc2?

Thank you

I'm not sure why you don't get the right activations.
The model definition looks a bit strange. If the model is designed to run as a sequential model, the shapes of the linear layers don't match.

In my code example the hooks run fine. Could you check if you used the same code logic?
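
As a point of comparison, here is a small sketch with hooks on two layers at once. The Net class and its input size are hypothetical; only the 100 and 40 output features mirror fc1 and fc2 from the printed model. Each hook is registered on its own module and stored under its own key, so the two activations cannot overwrite each other.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    # Hypothetical two-layer model: fc1 -> 100 features, fc2 -> 40 features
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 100)
        self.fc2 = nn.Linear(100, 40)

    def forward(self, x):
        x = F.relu(self.fc1(x))  # the fc1 hook fires when fc1 is called here
        return self.fc2(x)       # the fc2 hook fires when fc2 is called here


activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook


net = Net()
handles = [
    net.fc1.register_forward_hook(get_activation('fc1')),
    net.fc2.register_forward_hook(get_activation('fc2')),
]

output = net(torch.randn(4, 16))
print(activation['fc1'].shape)  # features from fc1
print(activation['fc2'].shape)  # features from fc2

for handle in handles:
    handle.remove()  # detach the hooks once the features are collected
```

Two things worth checking in your own code: that the hook is registered on net.fc1 (not net.fc2), and that fc1 is actually called inside forward, since a hook only fires when its module is executed.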

Hi @ptrblck,

It's not a problem with the shape. I use x.view() to reshape. I'm working with a graph ConvNet.

Here are my __init__() and forward() functions:

class My_net(nn.Module):

    def __init__(self, net_parameters):


        super(My_net, self).__init__()

        D, CL1_F, CL1_K, FC1_F, FC2_F, p = net_parameters

        # Conv layer 1
        self.cl1 = nn.Linear(CL1_K, CL1_F, 10)
        Fin = CL1_K
        Fout = CL1_F
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.cl1.weight.data.uniform_(-scale, scale)
        self.CL1_K = CL1_K
        self.CL1_F = CL1_F

        # Conv layer 2
        self.cl2 = nn.Linear(CL2_K, CL2_F, 10)
        Fin = CL2_K * CL1_F
        Fout = CL2_F
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.cl2.weight.data.uniform_(-scale, scale)
        self.CL2_K = CL2_K
        self.CL2_F = CL2_F

        # FC1
        self.fc1 = nn.Linear(FC1Fin, FC1_F, 10)
        Fin = FC1Fin
        Fout = FC1_F
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.fc1.weight.data.uniform_(-scale, scale)
        self.FC1Fin = FC1Fin

        # FC2
        self.fc2 = nn.Linear(FC1_F, FC2_F, 10)
        Fin = FC1_F
        Fout = FC2_F
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.fc2.weight.data.uniform_(-scale, scale)

    def forward(self, x, d, L, lmax, coord, adj_matrix,n_nodes):

        # conv layer 1
        x = conv_layer(x, self.cl1, L, lmax, self.CL1_F, self.CL1_K,n_nodes,masked=True)
        x = F.relu(x)

        # conv layer 2
        x = conv_layer(x, self.cl1, L, lmax, self.CL1_F, self.CL1_K,n_nodes,masked=True)
        x = F.relu(x)

        # Fully connected layer 1

        x = F.relu(x)

        # Fully connected layer 2

        x = self.fc2(x)
        x = F.relu(x)

        return x