A question about loading only part of a model with load_state_dict()

        self.featureExtract = nn.Sequential(                               # 271
            nn.Conv2d(configs[0], configs[1] , kernel_size=11, stride=2),  # 131
            nn.BatchNorm2d(configs[1]),
            nn.MaxPool2d(kernel_size=3, stride=2),    #65
            nn.ReLU(inplace=True),
            nn.Conv2d(configs[1], configs[2], kernel_size=5),   #61
            nn.BatchNorm2d(configs[2]),
            nn.MaxPool2d(kernel_size=3, stride=2),    #30
            nn.ReLU(inplace=True),
            nn.Conv2d(configs[2], configs[3], kernel_size=3), #28
            nn.BatchNorm2d(configs[3]),
            nn.ReLU(inplace=True),
            nn.Conv2d(configs[3], configs[4], kernel_size=3), #26
            nn.BatchNorm2d(configs[4]),
            nn.ReLU(inplace=True),
            nn.Conv2d(configs[4], configs[5], kernel_size=3), #24
            nn.BatchNorm2d(configs[5]),
        )

        self.conv_r1 = nn.Conv2d(feat_in, feature_out*4*anchor, 3)
        self.conv_r2 = nn.Conv2d(feat_in, feature_out, 3)
        self.conv_cls1 = nn.Conv2d(feat_in, feature_out*2*anchor, 3)
        self.conv_cls2 = nn.Conv2d(feat_in, feature_out, 3)
        self.regress_adjust = nn.Conv2d(4*anchor, 4*anchor, 1)

Above is the nn.Module of my class. Now I only want to use the pre-trained parameters for the self.featureExtract part, not the later layers. How should I load just that subset, given that

net.load_state_dict(torch.load(net_file))

(The line above loads all the parameters, but I only want to load a specific part.)
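One common approach (a sketch with a toy model standing in for the real network; it assumes the checkpoint stores a plain state_dict whose backbone keys start with `featureExtract.`, matching the attribute name above): filter the pretrained dict by key prefix and merge it into the current model's state_dict before loading.

```python
import torch
import torch.nn as nn

# Toy stand-in for the real network: a backbone plus a head
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.featureExtract = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
        self.head = nn.Conv2d(8, 2, 1)

# Pretend this dict came from torch.load(net_file)
pretrained = Net().state_dict()

net = Net()
# Keep only the featureExtract.* entries
backbone = {k: v for k, v in pretrained.items()
            if k.startswith('featureExtract.')}
# Merge them into the current state_dict, then load the full dict
state = net.state_dict()
state.update(backbone)
net.load_state_dict(state)
```

Alternatively, `net.load_state_dict(backbone, strict=False)` loads the filtered dict directly and leaves all other parameters untouched.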

This is a similar kind of answer from


Thanks a lot. BTW, do you know what "key" really means here?
Is it something like a hash value?

When you have a Python dictionary you access each element by doing dict[key];
the key is typically a string or an integer.

Models are stored as ordered dictionaries whose keys are the names of the parameters.

If you do state_dict['submodel.weight'] (the key is the string name of the parameter), it will return the parameter tensor.
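For example (a minimal sketch with a single conv layer):

```python
import torch

conv = torch.nn.Conv2d(3, 6, 3)
sd = conv.state_dict()

# The keys are strings naming each parameter of the module
print(list(sd.keys()))   # ['weight', 'bias']

# Indexing with the string key returns the tensor itself
w = sd['weight']
print(w.shape)           # torch.Size([6, 3, 3, 3])
```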


Thanks a lot, Juan.
But there are a great many parameters of the same type, e.g. nn.Conv2d, really a lot.
So how does it know which nn.Conv2d is the right one to choose when using this dictionary mapping?
Thanks!

The key is the name that you assign to the attribute in the nn.Module, therefore

class test(torch.nn.Module):
    def __init__(self):
        super(test,self).__init__()
        self.conv1 = torch.nn.Conv2d(10,15,10)
        self.customconv = torch.nn.Conv2d(100,1000,10)

test()
Out[7]: 
test(
  (conv1): Conv2d(10, 15, kernel_size=(10, 10), stride=(1, 1))
  (customconv): Conv2d(100, 1000, kernel_size=(10, 10), stride=(1, 1))
)

If you assign distinct names
you will get a tractable state dict.
In this case

q.keys()
Out[17]: odict_keys(['conv1.weight', 'conv1.bias', 'customconv.weight', 'customconv.bias'])

Since I assigned different names, I get different keys.
If you reused the name conv1 for every layer, each assignment would overwrite the previous one, and you would end up with a wrong, non-functional state dict.


Thanks a lot!!!

Thanks a lot. In "q.keys()", what does 'q' refer to? Thanks.

I don't remember; it was just a local variable I used to store the state dict for the example.


Thanks for your quick reply. I am transferring from TensorFlow to PyTorch recently and I am still learning. q should be the state_dict of the model defined in test. I will try to explore this part. Thanks Juan.

I figured it out. It is interesting that the state_dict has no relation to the forward function. I thought the model is defined by the forward function.

import torch
import torch.nn as nn
import torch.nn.functional as F

class test(torch.nn.Module):
    def __init__(self):
        super(test, self).__init__()
        self.conv1 = torch.nn.Conv2d(10, 15, 10)
        self.customconv = torch.nn.Conv2d(100, 1000, 10)
    # forward intentionally commented out: the state_dict still lists both convs
    """
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.customconv(x)))
        x = self.pool(F.relu(self.customconv(x)))
        return x
    """

model = test()
q = model.state_dict()
print(q.keys())

Hi,
The model is "defined" in the forward function. However, PyTorch assumes that if you define some weights you are going to use them, and it saves them in the state_dict.

forward defines how those weights are connected.


Thanks Juan. It's interesting to know this. Then I assume that
if we have the following forward, the graph defined here will go through customconv twice, something like:
CONV1 --> CUSTOMCONV --> CUSTOMCONV (SAME weights as the previous one)
instead of
CONV1 --> CUSTOMCONV --> CUSTOMCONV (DIFFERENT weights from the previous one)

def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.customconv(x)))
    x = self.pool(F.relu(self.customconv(x)))
    return x

If so, all parameters are defined in __init__, and in forward we just use them. It is possible that the parameters used in forward are a subset of those defined in __init__.
Those that are not used are never trained, so the model will be saved with their initial random weights.
I am not sure whether I am understanding this part correctly.
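A quick check of that claim (a toy sketch, not the model from this thread): a layer defined in `__init__` but never called in `forward` receives no gradient, so an optimizer step leaves it at its initial weights.

```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.used = nn.Linear(4, 4)
        self.unused = nn.Linear(4, 4)  # defined but never called in forward

    def forward(self, x):
        return self.used(x)

m = M()
init_unused = m.unused.weight.clone()

opt = torch.optim.SGD(m.parameters(), lr=0.1)
m(torch.randn(2, 4)).sum().backward()
opt.step()

# The unused layer got no gradient, so its weights are unchanged
print(torch.equal(m.unused.weight, init_unused))  # True
```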

Yes that’s more or less the idea.
So customconv will be updated given the contribution of both runs.
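A minimal sketch of that behavior: calling the same module twice reuses one weight tensor, so a single backward pass accumulates gradient from both applications into the same `.grad`.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 1, 1, bias=False)
x = torch.randn(1, 1, 4, 4)

# The same module applied twice reuses the same weight tensor
y = conv(conv(x)).sum()
y.backward()

# One gradient tensor, combining the contribution of both uses
print(conv.weight.grad.shape)  # torch.Size([1, 1, 1, 1])
```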


Thanks a lot. Your explanation makes the relationship between __init__ and forward much clearer to me, and I now understand how the model's computation is defined through the forward function.
It is great. Thanks.