How can I load my best model as a feature extractor/evaluator?

Hello,

I have stored my best model, where the network is as follows:

net

My_Net(
  (cl1): Linear(in_features=25, out_features=60, bias=True)
  (cl2): Linear(in_features=60, out_features=16, bias=True)
  (fc1): Linear(in_features=16, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)

To load my best model, I did the following:

my_best_model = torch.load('path to best model')
my_best_model.keys()
dict_keys(['epoch', 'state_dict', 'best_prec1', 'optimizer'])

What would I like to do?

How can I load and use my_best_model as:

  1. Feature extractor?
  2. Evaluator on new examples?

Thank you

You would have to load the state_dict and set the model to eval():

model = MyModel(...)
model.load_state_dict(my_best_model['state_dict'])  # restore the trained weights
model.eval()  # switch to evaluation mode (affects dropout, batchnorm, etc.)

Now you can use it to evaluate new samples:

new_sample = ...
output = model(new_sample)
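
As a side note (assuming PyTorch 0.4.0 or later), you could also disable gradient tracking during evaluation:

with torch.no_grad():  # no autograd bookkeeping is needed for inference
    output = model(new_sample)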

What do you mean by feature extractor?
Would you like to get a certain intermediate layer output from your model?

Thank you for your reply. By feature extractor, I mean getting the output from the last fully connected layer before the softmax.

You could return this activation together with the output in your forward method, or use a forward_hook on the layer.
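
A minimal sketch of the first approach, assuming ReLU activations and a log-softmax head as in a typical LeNet-style classifier:

def forward(self, x):
    x = F.relu(self.cl1(x))
    x = F.relu(self.cl2(x))
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    logits = self.fc3(x)                # output of the last linear layer, before the softmax
    out = F.log_softmax(logits, dim=1)
    return out, logits                  # the caller receives both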

What if I would like to do that for an intermediate layer, let's say the fourth layer?

I’ve created a small code snippet using a forward hook to store one activation from fc2:

import torch
import torch.nn as nn
import torch.nn.functional as F


class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.cl1 = nn.Linear(25, 60)
        self.cl2 = nn.Linear(60, 16)
        self.fc1 = nn.Linear(16, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.relu(self.cl1(x))
        x = F.relu(self.cl2(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.log_softmax(self.fc3(x), dim=1)
        return x


activation = {}
def get_activation(name):
    # returns a hook that stores this layer's output under `name`
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook


model = MyModel()
model.fc2.register_forward_hook(get_activation('fc2'))
x = torch.randn(1, 25)
output = model(x)
print(activation['fc2'])  # shape [1, 84], the fc2 activation

Thanks a lot. Awesome

@ptrblck,
In your small code snippet, we need to wrap x in a Variable before feeding it to the model.

x = Variable(x)
output = model(x)

And print(activation['fc2']) returns

*** KeyError: 'fc2'

If you are using PyTorch < 0.4.0, you have to wrap it in a Variable.
The most recent stable version is 0.4.0, where Variables and tensors were merged.
Have a look at the Migration Guide.
You’ll find the install instructions on the website.
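
In 0.4.0 the wrapping is not needed anymore; a plain tensor works directly:

x = torch.randn(1, 25)  # no Variable needed in 0.4+
output = model(x)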

The KeyError is strange. Have you registered the hook with get_activation('fc2')?
The name argument is used to store the activation in the dict, so I'm wondering why the error happened.

Yes,
I did

model.fc2.register_forward_hook(get_activation('fc2'))

Which PyTorch and Python version are you using?

print(torch.__version__)
0.3.1.post3

and Python 3.6

I just installed an env with PyTorch 0.3.1.post3 and Python 3.6.5.
Since in this PyTorch version, Variables and tensors weren’t merged, you have to use:

x = Variable(torch.randn(1, 25))

Your code should therefore throw an error when you try to run output = model(x).
Besides that, the code runs fine on my machine.

Hi back @ptrblck,

Thank you for your help. But I got stuck once again and I'm confused. Here is my network architecture (variable net):

net
ConvNet_LeNet5(
  (cl1): Linear(in_features=25, out_features=32, bias=True)
  (cl2): Linear(in_features=25, out_features=64, bias=True)
  (fc1): Linear(in_features=51200, out_features=100, bias=True)
  (fc2): Linear(in_features=100, out_features=4, bias=True)
)

My best network is saved as follows:

save_checkpoint({
    'epoch': int(epoch) + 1,
    'state_dict': net.state_dict(),
    'best_prec1': best_prec1,
    'optimizer': optimizer.state_dict(),
}, is_best)
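
For reference, save_checkpoint is roughly the helper from the official ImageNet example (filenames assumed):

import shutil
import torch

def save_checkpoint(state, is_best, filename='checkpoint.pth.tar'):
    torch.save(state, filename)  # always save the latest checkpoint
    if is_best:
        # keep a separate copy of the best checkpoint so far
        shutil.copyfile(filename, 'model_best.pth.tar')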

Before loading the best model, let's try the forward hook as you suggested:

net.fc2.register_forward_hook(get_activation('fc2'))

works well.

But we need to load the best model as a feature extractor. I did that as follows:

my_best_model = net.load_state_dict(torch.load('model_best.pth.tar'))

I get the following error:

*** KeyError: 'unexpected key "epoch" in state_dict'

Then I loaded the best model without calling load_state_dict, as follows:

my_best_model = torch.load('model_best.pth.tar')
my_best_model.keys()
dict_keys(['epoch', 'state_dict', 'best_prec1', 'optimizer'])
my_best_model = my_best_model['state_dict']
my_best_model.keys()
odict_keys(['cl1.weight', 'cl1.bias', 'cl2.weight', 'cl2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias'])
my_best_model = my_best_model['fc2.weight']

I'm not sure if this is the correct way to load fc2 as a feature extractor (since I didn't succeed in doing that with the hook). Please correct me.

x is a test example:

output = my_best_model(x)

It returns:

*** TypeError: 'torch.cuda.FloatTensor' object is not callable

All that is needed is to fix

my_best_model = net.load_state_dict(torch.load('model_best.pth.tar'))  # error: *** KeyError: 'unexpected key "epoch" in state_dict'

and

my_best_model.fc2.register_forward_hook(get_activation('fc2'))

Thank you for your help @ptrblck

Since you saved your checkpoint as a dict, you will also load it as such.
Therefore, to get your state_dict, you have to call checkpoint['state_dict'] on it.

Also, if you would like to use the fc2 as a feature extractor, you would have to restore your complete model and calculate the complete forward pass with your sample.
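
Putting it together, a minimal sketch (reusing the get_activation hook from above; the arguments to net are whatever your forward expects):

checkpoint = torch.load('model_best.pth.tar')
net.load_state_dict(checkpoint['state_dict'])  # restore the trained weights
net.eval()

net.fc2.register_forward_hook(get_activation('fc2'))
output = net(x)               # complete forward pass with your sample
features = activation['fc2']  # the fc2 activation for x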

Why did the hook approach not work?

@ptrblck I have an update,

I hope that I fixed loading the state_dict correctly. I did that as follows:

pretrained_dict = torch.load('model_best.pth.tar')
pretrained_dict = pretrained_dict['state_dict']

then

net.load_state_dict(pretrained_dict)
net.fc2.register_forward_hook(get_activation('fc2'))

Up to here it works fine without any error.
Can you confirm that with this process I load the pretrained weights (best model) of net and not the initial random weights used to initialize the network?

Then I did the following:

output = net(test_x[0], coord_test[0], adj_test[0], L_test[0], lmax_test[0], mask_test[0])

and I got the following error:

*** IndexError: invalid index to scalar variable.

Thank you @ptrblck for your help

The loading looks fine!
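
If you want to double-check that the pretrained weights replaced the random initialization, you could compare a parameter before and after loading:

w_before = net.fc2.weight.data.clone()  # randomly initialized weights
net.load_state_dict(pretrained_dict)
print(torch.equal(w_before, net.fc2.weight.data))  # False, if the weights were replaced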

It looks like you are using a LeNet-style architecture and trying to call the forward pass on multiple inputs.
Could you print the shape of test_x?
Also, what are the other tensors? I assume test_y is the target.

@ptrblck, problem solved. batch_size=4, so I need to forward 4 examples.

I have a question for you, if you don't mind:

Now I want to extract features from layer fc1. However, when I apply:

net.fc1.register_forward_hook(get_activation('fc1'))

I always get the output of fc2.

net
ConvNet_LeNet5(
  (cl1): Linear(in_features=25, out_features=32, bias=True)
  (cl2): Linear(in_features=25, out_features=64, bias=True)
  (fc1): Linear(in_features=51200, out_features=100, bias=True)
  (fc2): Linear(in_features=100, out_features=40, bias=True)
)

When I apply

net.fc1.register_forward_hook(get_activation('fc1'))

I'm supposed to get a vector of 100 features; however, I get 40.

Does your hook function

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook

work only for the last layer fc2?

Thank you

I'm not sure why you don't get the right activations.
The model definition looks a bit strange. If the model is designed to run as a sequential model, the shapes of the linear layers don't match.

In my code example the hooks run fine. Could you check if you used the same code logic?
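
To narrow it down, you could register hooks on both layers and compare the stored sizes (the arguments to net are placeholders for whatever your forward expects):

net.fc1.register_forward_hook(get_activation('fc1'))
net.fc2.register_forward_hook(get_activation('fc2'))

output = net(x)
for name, act in activation.items():
    print(name, act.size())  # fc1 should yield 100 features, fc2 40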

Hi @ptrblck,

It's not a problem with the shape. I use x.view() to reshape. I'm working with a graph ConvNet.

Here are my __init__() and forward() functions:

import numpy as np
import torch.nn as nn
import torch.nn.functional as F


class My_net(nn.Module):

    def __init__(self, net_parameters):
        print('My_net')
        super(My_net, self).__init__()
        np.random.seed(seed)

        # note: CL2_F, CL2_K and FC1Fin must also come out of net_parameters,
        # since they are used below
        D, CL1_F, CL1_K, CL2_F, CL2_K, FC1Fin, FC1_F, FC2_F, p = net_parameters

        # Conv layer 1
        self.cl1 = nn.Linear(CL1_K, CL1_F)
        Fin = CL1_K
        Fout = CL1_F
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.cl1.weight.data.uniform_(-scale, scale)
        self.cl1.bias.data.fill_(0.0)
        self.CL1_K = CL1_K
        self.CL1_F = CL1_F

        # Conv layer 2
        self.cl2 = nn.Linear(CL2_K, CL2_F)
        Fin = CL2_K * CL1_F
        Fout = CL2_F
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.cl2.weight.data.uniform_(-scale, scale)
        self.cl2.bias.data.fill_(0.0)
        self.CL2_K = CL2_K
        self.CL2_F = CL2_F

        # FC1
        self.fc1 = nn.Linear(FC1Fin, FC1_F)
        Fin = FC1Fin
        Fout = FC1_F
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.fc1.weight.data.uniform_(-scale, scale)
        self.fc1.bias.data.fill_(0.0)
        self.FC1Fin = FC1Fin

        # FC2
        self.fc2 = nn.Linear(FC1_F, FC2_F)
        Fin = FC1_F
        Fout = FC2_F
        scale = np.sqrt(2.0 / (Fin + Fout))
        self.fc2.weight.data.uniform_(-scale, scale)
        self.fc2.bias.data.fill_(0.0)

    def forward(self, x, d, L, lmax, coord, adj_matrix, n_nodes):
        # conv layer 1 (conv_layer is the graph convolution helper, not shown here)
        x = conv_layer(x, self.cl1, L, lmax, self.CL1_F, self.CL1_K, n_nodes, masked=True)
        x = F.relu(x)

        # conv layer 2
        x = conv_layer(x, self.cl2, L, lmax, self.CL2_F, self.CL2_K, n_nodes, masked=True)
        x = F.relu(x)

        # fully connected layer 1
        x = self.fc1(x)
        x = F.relu(x)

        # fully connected layer 2
        x = self.fc2(x)
        x = F.relu(x)

        return x