A detailed question about load_state_dict()

        self.featureExtract = nn.Sequential(                               # input: 271
            nn.Conv2d(configs[0], configs[1], kernel_size=11, stride=2),   # 131
            nn.BatchNorm2d(configs[1]),
            nn.MaxPool2d(kernel_size=3, stride=2),                         # 65
            nn.ReLU(inplace=True),
            nn.Conv2d(configs[1], configs[2], kernel_size=5),              # 61
            nn.BatchNorm2d(configs[2]),
            nn.MaxPool2d(kernel_size=3, stride=2),                         # 30
            nn.ReLU(inplace=True),
            nn.Conv2d(configs[2], configs[3], kernel_size=3),              # 28
            nn.BatchNorm2d(configs[3]),
            nn.ReLU(inplace=True),
            nn.Conv2d(configs[3], configs[4], kernel_size=3),              # 26
            nn.BatchNorm2d(configs[4]),
            nn.ReLU(inplace=True),
            nn.Conv2d(configs[4], configs[5], kernel_size=3),              # 24
            nn.BatchNorm2d(configs[5]),
        )

        self.conv_r1 = nn.Conv2d(feat_in, feature_out*4*anchor, 3)
        self.conv_r2 = nn.Conv2d(feat_in, feature_out, 3)
        self.conv_cls1 = nn.Conv2d(feat_in, feature_out*2*anchor, 3)
        self.conv_cls2 = nn.Conv2d(feat_in, feature_out, 3)
        self.regress_adjust = nn.Conv2d(4*anchor, 4*anchor, 1)

Above is the nn.Module part of my class. Now I only want to use the pre-trained parameters for the self.featureExtract part, but not for the later layers. How should I load only that part of the parameters, rather than calling

net.load_state_dict(torch.load(net_file))

(The call above loads all the params, but I just want to load some specific part.)

There is a similar answer in an earlier thread: the idea is to filter the checkpoint's state dict by its keys before loading it.
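A minimal sketch of that approach, assuming net_file stores a state dict saved from this same class (so the checkpoint keys carry the featureExtract. prefix):

import torch

# Load the full checkpoint: an OrderedDict mapping parameter names to tensors.
pretrained_dict = torch.load(net_file)

# Keep only the entries that belong to the featureExtract submodule.
feature_dict = {k: v for k, v in pretrained_dict.items()
                if k.startswith('featureExtract.')}

# Merge them into the model's current state dict, then load it back,
# so conv_r1, conv_cls1, etc. keep their freshly initialized weights.
model_dict = net.state_dict()
model_dict.update(feature_dict)
net.load_state_dict(model_dict)

Alternatively, net.load_state_dict(feature_dict, strict=False) accepts a partial state dict directly and simply skips the missing keys.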


Thanks a lot! BTW, do you know what a "key" really means here?
Is it something like a hash value?

When you have a Python dictionary, you access each element with dict[key]; the key is a string or an integer (in a state dict it is always a string).

Models are stored in ordered dictionaries whose keys are the names of each parameter.

If you do state_dict['submodel.weight'], it will return the tensor value.
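For example, a quick sketch (the module and the names here are just made up for illustration):

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv = nn.Conv2d(3, 8, 3)

sd = Net().state_dict()
print(sd.keys())                 # odict_keys(['conv.weight', 'conv.bias'])
print(sd['conv.weight'].shape)   # torch.Size([8, 3, 3, 3])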


Thanks a lot, Juan.
But there are a great many parameters that share the same layer type, e.g. nn.Conv2d, really a lot.
So how does the dictionary know which nn.Conv2d is the right one to pick when doing this mapping?
Thanks!

The key is the name that you assign to the variable in the nn.Module, therefore:

class test(torch.nn.Module):
    def __init__(self):
        super(test,self).__init__()
        self.conv1 = torch.nn.Conv2d(10,15,10)
        self.customconv = torch.nn.Conv2d(100,1000,10)

test()
Out[7]: 
test(
  (conv1): Conv2d(10, 15, kernel_size=(10, 10), stride=(1, 1))
  (customconv): Conv2d(100, 1000, kernel_size=(10, 10), stride=(1, 1))
)

If you assign proper names, you will get a tractable state dict. In this case:

q.keys()
Out[17]: odict_keys(['conv1.weight', 'conv1.bias', 'customconv.weight', 'customconv.bias'])

Since I assigned different names, I get different keys.
If you reused the name conv1 for every layer, the later assignment would overwrite the earlier one, and you would get a wrong, non-functional state dict.


Thanks a lot! :+1:

Thanks a lot. In "q.keys()", what is the meaning of 'q'? Thanks.

I don't remember; it was just a local variable I used to store a state dict for the demonstration.


Thanks for your quick reply. I am transitioning from TensorFlow to PyTorch and I am still learning. q should be the state_dict of the model defined in test. I will explore this part. Thanks, Juan.

I figured it out. It is interesting that the state_dict has no relation to the forward function. I thought the model was defined by the forward function.

import torch
import torch.nn as nn
import torch.nn.functional as F

class test(torch.nn.Module):
    def __init__(self):
        super(test, self).__init__()
        self.conv1 = torch.nn.Conv2d(10, 15, 10)
        self.customconv = torch.nn.Conv2d(100, 1000, 10)
    # The forward below is deliberately commented out: the state_dict
    # does not depend on it. (Note: self.pool is never defined above,
    # so this forward would not run as written anyway.)
    """
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.customconv(x)))
        x = self.pool(F.relu(self.customconv(x)))
        return x
    """

model = test()
q = model.state_dict()
print(q.keys())

Hi,
The model is "defined" in the forward function. However, PyTorch assumes that if you define some weights in __init__ you are going to use them, and it saves them in the state_dict.

forward defines how those weights are connected.
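Here is a small sketch of that point (the names are just for illustration): a layer that forward never touches still shows up in the state_dict, because it was registered in __init__.

import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super(Demo, self).__init__()
        self.used = nn.Linear(4, 4)
        self.unused = nn.Linear(4, 4)   # registered, but never called below

    def forward(self, x):
        return self.used(x)             # 'unused' plays no part in the graph

print(Demo().state_dict().keys())
# odict_keys(['used.weight', 'used.bias', 'unused.weight', 'unused.bias'])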


Thanks, Juan. It's interesting to know this. Then I assume that with the following forward, the graph will pass through customconv twice, like this:
conv1 → customconv → customconv (the SAME layer as before)
rather than
conv1 → customconv → customconv (a DIFFERENT layer)

def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.customconv(x)))
    x = self.pool(F.relu(self.customconv(x)))
    return x

If so, all parameters are defined in __init__, and in forward we just use them. It is possible that the parameters used in forward are a subset of those defined in __init__.
The unused ones are never trained, so the model will be saved with their initial random weights.
I am not sure whether I am understanding this part correctly.

Yes, that's more or less the idea.
So customconv will be updated with the gradient contributions of both calls.
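A small sketch of that accumulation, with made-up names: calling the same module twice means its weights receive gradient contributions from both calls in a single backward pass.

import torch
import torch.nn as nn

shared = nn.Linear(4, 4)
x = torch.randn(1, 4)

# The same weight tensor appears at two points in the graph...
loss = shared(shared(x)).sum()
loss.backward()

# ...so shared.weight.grad now holds the summed contribution of both
# calls, and one optimizer step updates the single shared parameter set.
print(shared.weight.grad.shape)   # torch.Size([4, 4])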


Thanks a lot. Your explanation makes the relationship between __init__ and forward much clearer to me, and I am now also pretty clear about how to define the model through the forward function.
It is great. Thanks.