Some detailed problem about torch.load_state_dict()

        self.featureExtract = nn.Sequential(                               # 271
            nn.Conv2d(configs[0], configs[1], kernel_size=11, stride=2),   # 131
            nn.MaxPool2d(kernel_size=3, stride=2),                         # 65
            nn.Conv2d(configs[1], configs[2], kernel_size=5),              # 61
            nn.MaxPool2d(kernel_size=3, stride=2),                         # 30
            nn.Conv2d(configs[2], configs[3], kernel_size=3),              # 28
            nn.Conv2d(configs[3], configs[4], kernel_size=3),              # 26
            nn.Conv2d(configs[4], configs[5], kernel_size=3),              # 24
        )

        self.conv_r1 = nn.Conv2d(feat_in, feature_out*4*anchor, 3)
        self.conv_r2 = nn.Conv2d(feat_in, feature_out, 3)
        self.conv_cls1 = nn.Conv2d(feat_in, feature_out*2*anchor, 3)
        self.conv_cls2 = nn.Conv2d(feat_in, feature_out, 3)
        self.regress_adjust = nn.Conv2d(4*anchor, 4*anchor, 1)

Above are the layers defined in my nn.Module class. I only want to use the pre-trained parameters for the self.featureExtract part, not the later layers. How can I load only part of the parameters of the pre-trained model?

(A plain load_state_dict() call loads all the params, but I only want to load a specific part.)

This is a similar kind of answer from
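For readers without the linked answer, here is a minimal sketch of one common way to load only part of a state dict: filter the pre-trained entries by key prefix and pass strict=False. The class Net, its layer sizes, and the names pretrained, prefix, and partial_dict are made up for illustration; in practice the dict would come from torch.load on your own checkpoint.

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    """Hypothetical, much smaller stand-in for the model in the question."""

    def __init__(self):
        super().__init__()
        self.featureExtract = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(8, 16, kernel_size=3),
        )
        self.conv_r1 = nn.Conv2d(16, 4, 3)


torch.manual_seed(0)
pretrained = Net()  # plays the role of the pre-trained network
model = Net()       # freshly initialised network to be partially loaded

# Assumption: in real use this dict would be read from a checkpoint file.
pretrained_dict = pretrained.state_dict()

# Keep only the entries whose key starts with the submodule name.
prefix = "featureExtract."
partial_dict = {k: v for k, v in pretrained_dict.items() if k.startswith(prefix)}

# strict=False tells load_state_dict to tolerate the keys we left out.
result = model.load_state_dict(partial_dict, strict=False)
print("not loaded:", result.missing_keys)
```

The returned result lists the keys that were not loaded, which is a quick way to check that only the later layers were skipped.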

Thanks a lot. BTW, do you know what “key” really means?
Is it some kind of hash value?

When you have a Python dictionary, you access each element with dict[key].
The key is usually a string or an integer.

Models are stored in ordered dictionaries whose keys are the names of each parameter.

If you do state_dict['submodel.weight'], it returns the tensor stored under that name.
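As a small illustration of the point about keys, the sketch below builds a toy module and looks up one tensor by its string key. The class name Outer and the Linear layer sizes are arbitrary choices for the example.

```python
import torch
import torch.nn as nn


class Outer(nn.Module):
    """Toy module with a single named submodule."""

    def __init__(self):
        super().__init__()
        self.submodel = nn.Linear(4, 2)


model = Outer()
state = model.state_dict()

# Keys are strings built from the attribute names, joined with dots.
print(list(state.keys()))  # ['submodel.weight', 'submodel.bias']

# Indexing with the string key returns the stored tensor.
weight = state["submodel.weight"]
print(weight.shape)  # torch.Size([2, 4])
```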

Thanks a lot Juan,
But a great many layers have the same type, e.g. nn.Conv2d, really a lot of them.
So how does it know which nn.Conv2d is the right one when it uses this dictionary for the mapping?

The key is the name that you assign to the attribute in the nn.Module. For example:

class test(torch.nn.Module):
    def __init__(self):
        super().__init__()  # required before assigning submodules
        self.conv1 = torch.nn.Conv2d(10,15,10)
        self.customconv = torch.nn.Conv2d(100,1000,10)

  (conv1): Conv2d(10, 15, kernel_size=(10, 10), stride=(1, 1))
  (customconv): Conv2d(100, 1000, kernel_size=(10, 10), stride=(1, 1))

If you assign proper names, you get a tractable state dict. In this case:

Out[17]: odict_keys(['conv1.weight', 'conv1.bias', 'customconv.weight', 'customconv.bias'])

Since I assigned different names, I get different keys.
If you always use the same name (e.g. conv1), each assignment replaces the previous layer, and you get a wrong, non-functional state dict.
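To make the naming rule concrete, here is a hedged sketch with two made-up classes: Reused assigns two layers to the same attribute name, and Stacked puts layers inside nn.Sequential, where the position index becomes part of the key. The layer sizes are arbitrary.

```python
import torch.nn as nn


class Reused(nn.Module):
    """Assigns two layers to the same attribute name."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3)
        self.conv1 = nn.Conv2d(8, 16, 3)  # replaces the layer above


class Stacked(nn.Module):
    """Layers inside nn.Sequential are keyed by their position."""

    def __init__(self):
        super().__init__()
        self.featureExtract = nn.Sequential(
            nn.Conv2d(3, 8, 3),
            nn.MaxPool2d(2),   # no parameters, so no keys for index 1
            nn.Conv2d(8, 16, 3),
        )


reused_keys = list(Reused().state_dict().keys())
stacked_keys = list(Stacked().state_dict().keys())

print(reused_keys)   # only one conv1 survives
print(stacked_keys)  # featureExtract.0.* and featureExtract.2.*
```

So two nn.Conv2d layers never collide as long as they live under different attribute names or different positions in a container.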

Thanks a lot!

Thanks a lot. In “q.keys()”, what does ‘q’ mean? Thanks.

I don't remember exactly; it was a local variable I used to hold a dict for the demonstration.

Thanks for your quick reply. I have recently been moving from TensorFlow to PyTorch and I am still learning. q should be the state_dict of the model defined in the test class. I will try to explore this part. Thanks Juan.

I figured it out. It is interesting that the state_dict has no relation to the forward function. I thought the model is defined by the forward function.

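import torch
import torch.nn as nn
import torch.nn.functional as F
class test(torch.nn.Module):
    def __init__(self):
        super().__init__()  # required before assigning submodules
        self.conv1 = torch.nn.Conv2d(10,15,10)
        self.customconv = torch.nn.Conv2d(100,1000,10)
        # Assumption: self.pool was not defined in the original post; MaxPool2d has no parameters.
        self.pool = nn.MaxPool2d(2)
    def forward(self, x):
        # forward() is never called below; state_dict() only depends on __init__
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.customconv(x)))
        x = self.pool(F.relu(self.customconv(x)))
        return x
model = test()
q = model.state_dict()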

The model is “defined” in the forward function. However, PyTorch assumes that if you define some weights you are going to use them, so it saves them in the state_dict.

forward defines how those weights are connected.
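A minimal sketch of this behaviour, using a made-up module PartlyUsed with one layer that forward never calls: the unused layer still appears in the state_dict, but it receives no gradient after backward.

```python
import torch
import torch.nn as nn


class PartlyUsed(nn.Module):
    """Hypothetical module: one layer is registered but never used."""

    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 2)
        self.unused = nn.Linear(4, 2)  # registered, but not called in forward

    def forward(self, x):
        return self.used(x)


model = PartlyUsed()

# Both layers are saved, because state_dict() follows __init__, not forward.
keys = list(model.state_dict().keys())
print(keys)

loss = model(torch.randn(3, 4)).sum()
loss.backward()

# Only the layer that took part in forward has a gradient.
print(model.used.weight.grad is not None)
print(model.unused.weight.grad is None)
```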

Thanks Juan. It’s interesting to know this. Then I assume that with the following forward, the graph defined here goes through customconv twice:

def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.customconv(x)))
        x = self.pool(F.relu(self.customconv(x)))
        return x

If so, all parameters are defined in __init__, and in forward we just use them. It is possible that the parameters used in forward are a subset of the parameters defined in __init__.
The unused ones are never trained, so they end up saved with their initial random weights.
I am not sure whether I am understanding this part correctly.

Yes, that’s more or less the idea.
So customconv will be updated with the gradient contributions from both calls.
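A tiny worked example of that accumulation, assuming a single shared 1x1 linear layer (named shared here) in place of customconv so the numbers are easy to check by hand: applying it twice gives y = w * (w * x), so the gradient dy/dw = 2 * w * x collects one term from each call.

```python
import torch
import torch.nn as nn

# One weight, no bias, so the arithmetic can be checked by hand.
shared = nn.Linear(1, 1, bias=False)
with torch.no_grad():
    shared.weight.fill_(3.0)  # w = 3

x = torch.tensor([[2.0]])

# The same layer is applied twice: y = w * (w * x) = w^2 * x
y = shared(shared(x))
y.sum().backward()

# dy/dw = 2 * w * x = 2 * 3 * 2 = 12, summed over both uses of the layer
print(shared.weight.grad)  # tensor([[12.]])
```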

Thanks a lot. Your explanation makes the relationship between __init__ and forward much clearer to me. I am also pretty clear now about how to define the model through the forward function.
It is great. Thanks.